# EAST dnn.forward Assertion failed

I want to use the EAST text detector from Python (on Windows 10 with 16 GB RAM), following this tutorial. However, with some images it systematically fails with the following error:

```
cv2.error: OpenCV(4.0.0) C:\projects\opencv-python\opencv\modules\dnn\src\dnn.cpp:835: error: (-215:Assertion failed) ld.inputBlobs[0]->total() == total(shapes[index]) in function 'cv::dnn::dnn4_v20180917::BlobManager::allocateBlobsForLayer'
```


I cannot tell what characteristics make an image fail or not. Here is the code to reproduce the error (an example of a troublesome image can be downloaded from here ... it's a pretty big image: width = 30253, height = 4537):

```python
import cv2 as cv

# sFileName is the path to the image, previously set
oInputImage = cv.imread(sFileName)
(iH, iW) = oInputImage.shape[:2]
iRequiredUnit = 32

# check if the image height is a multiple of 32
iBottom = 0
iHr = iH % iRequiredUnit
if 0 < iHr:
    # calculate how much padding is necessary
    iBottom = iRequiredUnit - iHr

# check if the image width is a multiple of 32
iRight = 0
iWr = iW % iRequiredUnit
if 0 < iWr:
    # calculate how much padding is necessary
    iRight = iRequiredUnit - iWr

if iBottom > 0 or iRight > 0:
    oImage = cv.copyMakeBorder(
        src=oInputImage,
        top=0,
        bottom=iBottom,
        left=0,
        right=iRight,
        borderType=cv.BORDER_CONSTANT,
        value=[0, 0, 0]
    )
else:
    oImage = oInputImage.copy()

(iH, iW) = oImage.shape[:2]

ib, ig, ir, _ = cv.mean(oImage)
oBlob = cv.dnn.blobFromImage(
    oImage, 1.0, (iW, iH), (ib, ig, ir),
    swapRB=True, crop=False
)

# EAST_path initialized appropriately previously
oNet = cv.dnn.readNet(EAST_path)
oNet.setInput(oBlob)
asLayerNames = [
    "feature_fusion/Conv_7/Sigmoid",
    "feature_fusion/concat_3"]
(afScores, aoGeometry) = oNet.forward(asLayerNames)
```


The last line causes the error.

EDIT 0:

I tried following the example kindly indicated by berak, but with no success, since most of the relevant differences occur after the call to net.forward. I also tried passing smaller sizes to blobFromImage to avoid the need for the copyMakeBorder part, but if the specified size is too small the detection precision drops (tried on other images too).

I already posted a request on Stack Overflow (sorry for cross-posting), but I received no answer. I was about to open an issue on GitHub, but there I was invited to post here first.

EDIT 1:

I ran some experiments to see how memory occupation could affect the execution. Here are some results, to be read as follows:

- rows: the first and second rows show the height and width passed as parameters to the OpenCV example; the following rows contain the memory occupation at various stages of execution
- columns: the first column states the phase at which memory was measured; the remaining columns show the memory occupied by the program, expressed in GB. Missing values are the cases where EAST crashed with the error reported above

```
height  320  640  1024  2048  4096  8192  4096  4544
width   ...
```


the pyimagesearch example is outdated, do not use that; use OpenCV's own Python sample instead.

then, it's unclear why you need the copyMakeBorder() code, but the final input has to be a multiple of 32 in both W and H
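The multiple-of-32 constraint mentioned above can be met with plain array padding; here is a minimal sketch (the function name is mine, and `numpy.pad` is used here as an equivalent of `copyMakeBorder` with `BORDER_CONSTANT`):

```python
import numpy as np

def pad_to_multiple(img, unit=32):
    # pad bottom/right with black pixels so both spatial
    # dimensions become multiples of `unit` (no-op if they already are)
    h, w = img.shape[:2]
    bottom = (unit - h % unit) % unit
    right = (unit - w % unit) % unit
    pad = [(0, bottom), (0, right)] + [(0, 0)] * (img.ndim - 2)
    return np.pad(img, pad, mode="constant")
```

For the image from the question, a height of 4537 would become 4544 this way, matching the sizes discussed below.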


Thanks for the answer and the link; I was unaware of that example. The decoding and NMS parts look different, but they take place after calling net.forward, which is where I am experiencing issues. Up to that line the code is mostly the same, and when I try it I still get the same error.

Let me add some more details. The image is pretty big: width = 30253 and height = 4537. Those are not multiples of 32 and thus the image must be resized, or I get

```
(-201:Incorrect size of input array) Inconsistent shape for ConcatLayer in function 'cv::dnn::ConcatLayerImpl::getMemoryShapes'
```

(continues in the next comment..)

(2019-05-15 10:11:03 -0500)

Let r be the remainder of the integer division of one of the dimensions by 32, and let d = 32 - r. Starting from the example, I tried the following:

1. leave the program arguments at their defaults (i.e., 320 x 320): it works (or, at least, it doesn't fail the assertion). However, resizing the blob that much causes a noticeable degradation of the detection quality. I tried multiples of 320, and the higher the better (and the longer to complete). Same with other images
2. set the actual dimensions minus r (i.e., 30240 x 4512): the execution fails with the usual error
3. set the actual dimensions plus d (i.e., 30272 x 4544): the execution fails with the usual error
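The r and d values behind the sizes above are easy to verify; a quick sketch (the helper name is mine):

```python
def r_and_d(dim, unit=32):
    # remainder of the integer division by `unit`, and the padding
    # needed to reach the next multiple of `unit`
    r = dim % unit
    return r, (unit - r) % unit

for dim in (30253, 4537):
    r, d = r_and_d(dim)
    print(f"{dim}: r={r}, d={d}, minus r -> {dim - r}, plus d -> {dim + d}")
# 30253: r=13, d=19, minus r -> 30240, plus d -> 30272
# 4537: r=25, d=7, minus r -> 4512, plus d -> 4544
```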

(continues in the next comment...)

(2019-05-15 10:11:30 -0500)
4. since for the application I am working on I can afford a lot of computation time, but precision is required, I wish to run the text detection at full size to get the highest accuracy, which is why I came up with the copyMakeBorder() code. I tried putting it into the example and, as expected, the execution fails with the usual error

Any hints on how to investigate this?

(2019-05-15 10:11:57 -0500)

i think it can be reproduced with any image, given the problematic sizes.

btw, i'm running out of RAM on Colab with W > 32*500 already. can you check your memory usage?

(2019-05-16 03:31:10 -0500)

Good point. However, that is not the case: out of 37 test images of comparable size, the problem occurs with only 4 of them, so the size itself might not be the only reason. Nevertheless, if I cannot find a solution any other way, I might try scaling the image down.
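Scaling down while keeping the multiple-of-32 constraint could be sketched like this (the function name and the `max_side` cap are assumptions of mine, not something from the thread):

```python
def scaled_east_size(w, h, max_side=8192, unit=32):
    # shrink so the longest side fits within max_side, then round
    # each dimension down to a multiple of `unit` (EAST requirement)
    scale = min(1.0, max_side / max(w, h))
    nw = max(unit, int(w * scale) // unit * unit)
    nh = max(unit, int(h * scale) // unit * unit)
    return nw, nh
```

The resulting (nw, nh) could then be passed as the size argument to blobFromImage, trading some precision for a bounded memory footprint.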

(2019-05-16 03:40:14 -0500)

I did some additional experiments and updated the question with the results. Short answer: memory occupation is relevant, but it does not look like it is the reason why the execution fails. Any other ideas?

(2019-05-21 08:18:28 -0500)
