"-file option is deprecated, please use generic option -files instead." #79

@UMDTERPS

Hello! I am trying to run a job for our data team, and we are getting errors when using Dumbo. We are on the latest versions of Dumbo and Cloudera.

Command used to run the job:

[benjamin@arya dedup]$ dumbo start jaccard.py -input products -output products-output13 -hadoop /usr/ -hadooplib /opt/cloudera/parcels/CDH/lib/hadoop-mapreduce/

Stacktrace:

13/10/30 13:05:32 WARN streaming.StreamJob: -file option is deprecated, please use generic option -files instead.
13/10/30 13:05:32 WARN streaming.StreamJob: -jobconf option is deprecated, please use -D instead.
packageJobJar: [/home/benjamin/mapreduce/jobs/dedup/typedbytes.pyc, /home/benjamin/mapreduce/jobs/dedup/jaccard.py, /home/benjamin/mapreduce/jobs/dedup/dumbo/backends/common.pyc] [] /tmp/streamjob5478521893861821465.jar tmpDir=null
13/10/30 13:05:33 WARN mapred.JobClient: Use GenericOptionsParser for parsing the arguments. Applications should implement Tool for the same.
13/10/30 13:05:34 INFO mapred.FileInputFormat: Total input paths to process : 1
13/10/30 13:05:35 INFO mapred.JobClient: Running job: job_201310231818_0015
13/10/30 13:05:36 INFO mapred.JobClient: map 0% reduce 0%
13/10/30 13:05:47 INFO mapred.JobClient: Task Id : attempt_201310231818_0015_m_000000_0, Status : FAILED
java.lang.RuntimeException: java.lang.RuntimeException: java.lang.ClassNotFoundException: Class org.apache.hadoop.streaming.AutoInputFormat not found
    at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:1649)
    at org.apache.hadoop.mapred.JobConf.getInputFormat(JobConf.java:620)
    at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:394)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:332)
    at org.apache.hadoop.mapred.Child$4.run(Child.java:268)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Unknown Source)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1408)
    at org.apache.hadoop.mapred.Child.main(Child.java:262)
Caused by: java.lang.RuntimeException: java.lang.ClassNotFoundException: Class org.apache.hadoop.streaming.AutoInputFormat not found
    at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:1617)
    at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:1641)

Any help would be greatly appreciated!
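
One thing that may be worth ruling out is whether any jar under the directory passed to -hadooplib actually contains the class named in the ClassNotFoundException. The following is a minimal Python 3 sketch for that check. The helper name find_jars_with_class is hypothetical and not part of Dumbo or Hadoop. It relies only on the fact that jar files are zip archives, and it inspects the local directory only, so it says nothing about the classpath the map task sees on the cluster nodes.

```python
import os
import zipfile


def find_jars_with_class(lib_dir, class_name):
    """Return sorted paths of .jar files under lib_dir that contain class_name.

    Hypothetical diagnostic helper; not part of Dumbo or Hadoop.
    """
    # A class a.b.C is stored inside a jar as the zip entry a/b/C.class.
    entry = class_name.replace(".", "/") + ".class"
    matches = []
    for root, _dirs, files in os.walk(lib_dir):
        for name in files:
            if not name.endswith(".jar"):
                continue
            path = os.path.join(root, name)
            try:
                with zipfile.ZipFile(path) as jar:
                    if entry in jar.namelist():
                        matches.append(path)
            except (zipfile.BadZipFile, OSError):
                # Skip unreadable files, broken symlinks, and non-zip files.
                continue
    return sorted(matches)


if __name__ == "__main__":
    # Directory and class name are taken from the command and stack trace above.
    lib = "/opt/cloudera/parcels/CDH/lib/hadoop-mapreduce/"
    cls = "org.apache.hadoop.streaming.AutoInputFormat"
    for jar_path in find_jars_with_class(lib, cls):
        print(jar_path)
```

If the script prints nothing, no jar in that directory carries the class, which would at least narrow down where to look. If it does print a jar, the class exists locally and the problem more likely lies in what gets shipped to, or found by, the task JVMs.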