[Hadoop] Shuffle & Sorting of MapReduce Task


What is Shuffling?

The Mapper produces intermediate key-value pairs, which are then transferred to the Reducer tasks. This transfer is known as shuffling.
During the shuffle, the framework can also sort the data by key.
Shuffling begins as soon as some of the map tasks have finished; it does not wait for every map task to complete. Overlapping the transfer with the remaining map work makes the job faster overall, although the reduce function itself only runs once all map output has been collected.
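The routing step described above can be sketched in plain Python. This is a minimal simulation, not Hadoop code: the names `partition` and `shuffle` are hypothetical, and `zlib.crc32` is only a stand-in for a stable key hash. The point it illustrates is that every pair with the same key ends up at the same reducer.

```python
import zlib
from collections import defaultdict


def partition(key: str, num_reducers: int) -> int:
    # Assumption: a stable hash of the key, modulo the number of reducers.
    # crc32 is used here only so the result is the same on every run.
    return zlib.crc32(key.encode("utf-8")) % num_reducers


def shuffle(map_outputs, num_reducers):
    """Route every (key, value) pair from every mapper to a reducer bucket."""
    buckets = defaultdict(list)
    for pairs in map_outputs:
        for key, value in pairs:
            buckets[partition(key, num_reducers)].append((key, value))
    return dict(buckets)


# Two hypothetical mappers that each emitted word-count style pairs.
mapper_a = [("apple", 1), ("pear", 1)]
mapper_b = [("apple", 1), ("fig", 1)]
routed = shuffle([mapper_a, mapper_b], num_reducers=2)
```

Both `("apple", 1)` pairs land in the same bucket even though they came from different mappers, which is what lets a single reducer see all values for that key.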

What is Sorting?

The MapReduce framework automatically sorts the Mapper output by key, so all key-value pairs are sorted before they reach the Reducer.
Because the input is sorted, the Reducer can easily tell when a new group begins: a change in key marks the start of the next reduce call.
If the job is configured with zero reduce tasks, the shuffle and sort phases do not take place and the job ends after the map tasks finish.
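The key-boundary idea can be illustrated with a small Python sketch. It is a simulation under stated assumptions rather than Hadoop's implementation: `reduce_sorted` is a hypothetical helper, and `itertools.groupby` plays the role of the framework noticing that the key has changed in sorted input.

```python
from itertools import groupby
from operator import itemgetter


def reduce_sorted(pairs, reducer):
    """Sort pairs by key, then call `reducer` once per run of equal keys."""
    ordered = sorted(pairs, key=itemgetter(0))
    results = []
    # groupby starts a new group whenever the key differs from the previous
    # one, which only yields one group per key because the input is sorted.
    for key, group in groupby(ordered, key=itemgetter(0)):
        results.append((key, reducer(value for _, value in group)))
    return results


# Hypothetical word-count pairs arriving at one reducer in arbitrary order.
counts = reduce_sorted([("pear", 1), ("apple", 1), ("apple", 1)], sum)
```

Without the sort, equal keys could be separated by other keys and `groupby` would split them into several groups, which is why sorting happens before the reduce step.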
