Setting the number of reducers fails #88

@kitein9t

I am using Hadoop Streaming with -io typedbytes and set mapred.reduce.tasks=2, but the job produces only one output file. If I set mapred.reduce.tasks=0 instead, I get many output files. I am very confused.

So my question is:
How can I make mapred.reduce.tasks=num (num > 1) take effect when using -io typedbytes in streaming?

PS: my mapper's output is (key: Python string, value: NumPy array).
Here is my .sh file:
hadoop jar $HADOOP_HOME/contrib/streaming/hadoop-streaming-1.2.1.jar \
-D mapred.reduce.tasks=2 \
-fs local \
-jt local \
-io typedbytes \
-inputformat org.apache.hadoop.mapred.SequenceFileAsBinaryInputFormat \
-input FFT_SequenceFile \
-output pinvoutput \
-mapper 'pinvmap.py' \
-file pinvmap.py
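
For reference, the number of output files can be checked with a small script like the one below. This is only a sketch: it assumes the output is the local directory pinvoutput from the command above and that each task writes one file following the usual part-* naming. count_part_files is a hypothetical helper name, not part of Hadoop.

```python
import glob
import os


def count_part_files(output_dir):
    """Return the sorted names of the part-* files found in output_dir."""
    # Assumption: each map or reduce task writes one file named part-NNNNN.
    # Hidden checksum files (.part-NNNNN.crc) and _SUCCESS do not match
    # this pattern, so they are not counted.
    pattern = os.path.join(output_dir, "part-*")
    return sorted(
        os.path.basename(path)
        for path in glob.glob(pattern)
        if os.path.isfile(path)
    )


if __name__ == "__main__":
    # "pinvoutput" is the -output directory from the command above.
    names = count_part_files("pinvoutput")
    print(len(names), names)
```

Running it once after the mapred.reduce.tasks=2 job and once after the mapred.reduce.tasks=0 job makes the difference in file counts easy to compare.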
