News feed

Do you have questions about the course?

If you are registered for a current course offering, see the course room in Canvas. You will find the right course room under "Courses" in the personal menu.

If you are not registered, see the course memo (Kurs-PM) for IS2202 or contact your student office, study counsellor, or education office.

In the news feed you will find updates to pages, the schedule, and posts from teachers (when they also need to reach previously registered students).

May 2014

Here is an interesting course I can recommend:

Introduction to High-Performance Computing

PDC Summer School
KTH Royal Institute of Technology, Stockholm, Sweden

August 18-29, 2014

http://www.pdc.kth.se/education/summer-school

Teacher Mats Brorsson wrote the post on 16 May 2014

May 2012

Given that:

*A GPU contains multiple SIMD processors.

*Each SIMD processor contains multiple lanes.

*Each SIMD processor is assigned a single thread block (by the thread block scheduler).

The question is which one of these two alternatives is correct:

-Alt1 (parallel execution of threads): Each lane runs a single thread from the thread block -> to execute completely, each thread takes as many clock cycles as there are elements in the vector that it reads from/writes to.

-Alt2 ("sequential-alternating" execution of threads): Each thread occupies all lanes in a single SIMD processor -> each thread takes round_up(<nr_of_elements_in_the_vector>/<nr_of_lanes_per_SIMD_processor>) clock cycles to finish execution (not necessarily consecutive) -> the thread scheduler (in each SIMD processor) alternates between different threads even if a thread has not finished all its cycles. So threads don't execute in parallel.

(PS. Alt1 is what I understood from the GPU class/slides; Alt2 is what I understood from the book)
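
To make the two cycle counts concrete, here is a minimal Python sketch that only evaluates the formulas exactly as stated in Alt1 and Alt2. The function names and the example sizes (32 elements, 16 lanes) are assumptions chosen for illustration; this is not a model of real GPU hardware and does not say which alternative is right.

```python
import math


def cycles_alt1(n_elements: int) -> int:
    """Alt1 as stated: one thread per lane, so a thread needs one
    clock cycle per element of the vector it reads from/writes to."""
    return n_elements


def cycles_alt2(n_elements: int, lanes_per_simd: int) -> int:
    """Alt2 as stated: one thread spans all lanes, so it needs
    round_up(n_elements / lanes_per_simd) clock cycles."""
    return math.ceil(n_elements / lanes_per_simd)


if __name__ == "__main__":
    n_elements = 32      # hypothetical vector length
    lanes_per_simd = 16  # hypothetical lane count per SIMD processor
    print("Alt1 cycles per thread:", cycles_alt1(n_elements))
    print("Alt2 cycles per thread:", cycles_alt2(n_elements, lanes_per_simd))
```

With these assumed sizes, Alt1 gives 32 cycles per thread and Alt2 gives 2; a 33-element vector would round Alt2 up to 3.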

Oussama Chammam wrote the post on 2 May 2012

Oussama Chammam tagged with SIMD, thread, thread block, thread scheduling and lane. 2 May 2012

Oussama Chammam commented on 3 May 2012

Alt1 is the correct alternative. I think the book was a bit unclear about that, or maybe I missed something in it; in any case, the slides are clearer and have more figures.

Thank you Artur for answering the question and for the slides.