I've implemented this in Unity with Vuforia and Region Capture (search GitHub), but here are the steps you would follow in OpenCV:
- Detect the marker, which is the B/W coloring page (something like sturkmen's answer might work)
- Once the marker is detected, you know where the page sits in the camera stream
- Using the four corners, cut out and unwarp the colored page from the camera stream so it fills a square
- Use this square image as the texture for your UV map on your 3D model
- Display the 3D model on the marker
The key is that the model's UV map has to roughly match the actual coloring page. You'll have to use some tricks to texture the backside of the model, like having its faces double up on the front side's area of the UV map. Search for "UV map" on Google Images to get an idea of what they look like if you don't know. Good luck.
Closed down for obvious reasons. (edit) only reopened because @Raki was so nice to give you some pointers. Keep in mind that we expect more from a question :).