Using OpenCV on RDD data from images converted to a SequenceFile in PySpark

Let me explain my experiment. I converted image files to a SequenceFile (SequenceWritable) using Java, i.e. from the local drive into a Hadoop (HDFS) file. I am now trying to read this SequenceFile from Hadoop using PySpark, and I am able to load the data into an RDD.

When I try to pass this RDD to an OpenCV function, the code fails. I need help with this.

Code example:

import cv2
import numpy as np

imageRdd = sc.sequenceFile("/user/GR5017759/Retinopathy/OutputSeq")
R = cv2.imdecode(np.asarray(bytearray(imageRdd), dtype=np.uint8))

===================================

error:

TypeError: 'RDD' object is not iterable

If you have any idea about this, please help me.
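For context, the error comes from calling bytearray() on the RDD itself: an RDD is a distributed collection, so it cannot be iterated on the driver like a list. A minimal sketch of one possible approach is to decode each record inside a map() instead. This assumes the Java writer stored each image's encoded bytes (e.g. JPEG or PNG) as the SequenceFile value, so that PySpark delivers them as a bytearray, and that OpenCV and NumPy are installed on every worker. The helper names to_uint8_buffer, decode_record and decode_images are hypothetical, introduced only for illustration.

```python
import numpy as np


def to_uint8_buffer(value):
    """Turn the raw bytes of one SequenceFile value into a 1-D uint8 array."""
    # Assumption: the value arrives as bytes or bytearray (e.g. a BytesWritable).
    return np.frombuffer(bytes(value), dtype=np.uint8)


def decode_record(record):
    """Decode one (key, value) record into (key, image); runs on the executors."""
    import cv2  # imported here so each worker process loads OpenCV itself

    key, value = record
    # cv2.imdecode returns None when the bytes are not a valid encoded image.
    image = cv2.imdecode(to_uint8_buffer(value), cv2.IMREAD_COLOR)
    return key, image


def decode_images(sc, path):
    """Hypothetical helper: load the SequenceFile and decode record by record."""
    # map() applies decode_record to each element of the RDD, so the RDD
    # itself is never passed to bytearray() or to OpenCV.
    return sc.sequenceFile(path).map(decode_record)
```

With an existing SparkContext sc, this could be tried as images = decode_images(sc, "/user/GR5017759/Retinopathy/OutputSeq") followed by key, img = images.first() to inspect a single decoded image. If the values were written in some other Writable type, the conversion in to_uint8_buffer would need to change accordingly.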