I can't point you to a paper or anything similar (I don't even know whether anyone has tested this). But I can tell you the advantages and disadvantages of a small versus a large image size.
You probably already know that SIFT is not the fastest feature descriptor around. If you want to speed up recognition, you can try smaller image sizes; processing will be considerably faster. For example, in an Android app I used a combined FAST detector and SIFT descriptor (so an already sped-up variant of SIFT). The initial size I used was 3264 x 2448 (a large resolution for a smartphone). Training on one image took ~10 seconds (on a Nexus 5). After that I tested 1920 x 1080 on the same device, and training on the image took ~4 seconds.
So if you want recognition or training to run as fast as possible, you should use a small image size.
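To put those two timings in relation to the image sizes, here is a quick back-of-the-envelope calculation in Python. It only uses the resolutions and approximate timings quoted above; the variable names are made up for illustration, and one measurement on one device should not be read as a general scaling law.

```python
# Resolutions and approximate training times quoted in this answer (Nexus 5).
large = (3264, 2448)   # trained in ~10 s
small = (1920, 1080)   # trained in ~4 s

pixels_large = large[0] * large[1]
pixels_small = small[0] * small[1]

# How much bigger is the large image, and how much slower was it?
pixel_ratio = pixels_large / pixels_small
time_ratio = 10 / 4

print(f"pixel ratio: {pixel_ratio:.2f}")
print(f"time ratio:  {time_ratio:.2f}")
```

In this one measurement the runtime dropped by less than the pixel count did, so I would not expect a strictly proportional speedup from shrinking the image.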
But if you take a small image size, you will probably get a smaller image pyramid, which means your algorithm covers a smaller range of scales. E.g., the pyramid I built from the 3264 x 2448 image was slower to build, but it also contained more images, so my object could be detected over a larger range of distances than with the pyramid built from the 1920 x 1080 image.
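To illustrate why a larger image gives you more pyramid levels, here is a minimal sketch that simply halves the image until the shorter side drops below a cutoff. This is not how any particular SIFT implementation builds its pyramid; the function name `pyramid_levels` and the `min_side=32` cutoff are assumptions chosen for illustration.

```python
def pyramid_levels(width, height, min_side=32):
    """List the (width, height) of each level of a simple halving pyramid.

    min_side is a hypothetical cutoff: halving stops once the shorter
    side of the image would fall below it.
    """
    levels = []
    w, h = width, height
    while min(w, h) >= min_side:
        levels.append((w, h))
        w, h = w // 2, h // 2  # next level is half the size
    return levels


large = pyramid_levels(3264, 2448)
small = pyramid_levels(1920, 1080)
print(len(large), "levels for 3264 x 2448")
print(len(small), "levels for 1920 x 1080")
```

Under these assumptions the larger image yields one more level than the smaller one, which is the extra scale range you pay for with the longer build time.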
So if you want to support a large range of distances (scales), it is better to use a large image size, at the cost of slower processing. If you want fast recognition and training of an image, you should pick a smaller size.
A good combination would be: use a fairly large image to train your object for recognition, so you gain a good range in which it can be detected, and for recognition use a smaller image (but not too small!) that you compare against your reference image, so you get good performance while tracking your object.
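As a rough sketch of the "smaller, but not too small" part, the helper below picks a downscaled size for the recognition image while keeping the aspect ratio. The function name `query_size` and both limits (`max_side=1920`, `min_side=32`) are assumptions for illustration, not values prescribed by SIFT or by any library; tune them for your own setup.

```python
def query_size(width, height, max_side=1920, min_side=32):
    """Return a downscaled (width, height) that keeps the aspect ratio.

    max_side caps the longer side of the image; min_side is a
    hypothetical lower bound below which we refuse to go.
    """
    # Never upscale: the scale factor is at most 1.
    scale = min(1.0, max_side / max(width, height))
    new_w = max(1, round(width * scale))
    new_h = max(1, round(height * scale))
    if min(new_w, new_h) < min_side:
        raise ValueError("image would become too small for recognition")
    return new_w, new_h


print(query_size(3264, 2448))
```

The idea would be to keep the reference (training) image at full resolution and only pass the images used for recognition through a helper like this before extracting features.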
IMHO, you can't go smaller than the internal patch size (32, IIRC), but from there on it's simply: the larger, the slower.