An image rejected by a multimodal API almost always has a mundane cause: the format, a path sent where base64 was expected, the size, or a misdeclared media_type. What follows is an ordered method for diagnosing it.
A multimodal API that rejects an image often returns a vague message. Yet the real causes are few and easy to isolate. We check them in a fixed order rather than at random.
Format, path and encoding
Start with the format. Many APIs accept only a subset of image formats: PNG, JPEG, sometimes WebP. An exotic format or a badly converted file can be rejected without any useful explanation. Check the file's real format, as identified by its leading bytes, not the extension in its name.
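As a minimal sketch of checking the real format, the Python function below inspects a file's leading bytes for the PNG, JPEG and WebP signatures. The function name and the choice of these three formats are illustrative assumptions; the set your API actually accepts should come from its documentation.

```python
from typing import Optional


def sniff_image_format(data: bytes) -> Optional[str]:
    """Guess the image format from its leading bytes, ignoring the file name."""
    if data.startswith(b"\x89PNG\r\n\x1a\n"):
        return "png"
    if data.startswith(b"\xff\xd8\xff"):
        return "jpeg"
    # WebP is a RIFF container: "RIFF", 4 size bytes, then "WEBP".
    if data[:4] == b"RIFF" and data[8:12] == b"WEBP":
        return "webp"
    return None  # unknown, exotic, or truncated file
```

A file named photo.png whose content starts with the JPEG signature would be reported as "jpeg" here, which is exactly the mismatch to look for.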
Next comes the confusion between a path and base64. Some APIs want a path or a URL; others want the image content itself, base64-encoded in the request body. Sending a local path where the API expects base64 produces an opaque error. Read the specification of the exact field you are filling.
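For an API that expects base64 content, the difference between sending the path and sending the encoded bytes can be sketched as follows. The helper names and the field layout in build_image_field are hypothetical; only the base64 and pathlib calls are standard Python.

```python
import base64
from pathlib import Path


def to_base64(data: bytes) -> str:
    """Encode raw image bytes as an ASCII base64 string."""
    return base64.b64encode(data).decode("ascii")


def load_as_base64(path: str) -> str:
    """Read the file at `path` and return its content, base64-encoded."""
    return to_base64(Path(path).read_bytes())


def build_image_field(path: str, media_type: str) -> dict:
    # Hypothetical field layout: the keys "media_type" and "data" are
    # assumptions; use the names your API's specification gives.
    return {"media_type": media_type, "data": load_as_base64(path)}
```

The point of the sketch is that the string placed in the request is the encoded file content, never the path string itself.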
Size and media_type
Size is a frequent trap. Beyond a limit in bytes or in pixels, the image is rejected. Resize or recompress before sending, rather than waiting for the server-side error.
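A client-side size check might look like the sketch below. Both limits are hypothetical placeholders, and the pixel check reads dimensions only from PNG headers to stay within the standard library; other formats, and the resizing itself, would need an imaging library that is not shown here.

```python
import struct
from typing import List, Optional, Tuple

# Hypothetical limits for illustration; take the real values
# from your API's documentation.
MAX_BYTES = 5 * 1024 * 1024
MAX_SIDE_PX = 8000

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def png_dimensions(data: bytes) -> Optional[Tuple[int, int]]:
    """Return (width, height) from a PNG's IHDR chunk, or None if not a PNG."""
    if len(data) < 24 or not data.startswith(PNG_SIGNATURE):
        return None
    if data[12:16] != b"IHDR":
        return None
    # Width and height are big-endian 32-bit integers at offsets 16 and 20.
    width, height = struct.unpack(">II", data[16:24])
    return width, height


def size_problems(data: bytes) -> List[str]:
    """List the size limits the image exceeds; an empty list means none."""
    problems = []
    if len(data) > MAX_BYTES:
        problems.append(f"{len(data)} bytes exceeds {MAX_BYTES}")
    dims = png_dimensions(data)
    if dims is not None and max(dims) > MAX_SIDE_PX:
        problems.append(f"{dims[0]}x{dims[1]} px exceeds {MAX_SIDE_PX} px per side")
    return problems
```

Running size_problems before the request turns a vague server-side rejection into a specific local message.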
Finally, check the media_type. Declaring image/png for a JPEG can break decoding on the API side. The media_type must match the real binary content, not a guess.
The principle to keep: when an image is rejected, walk through format, encoding, size, then media_type before suspecting the model.
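The ordered walk can be sketched as a single function that returns the first failing check. Everything here is an assumption made for illustration: the function names, the byte limit, and the idea that the request carries a base64 string alongside a declared media_type. It is a debugging aid under those assumptions, not any particular API's validation logic.

```python
import base64
from typing import Optional

MAX_BYTES = 5 * 1024 * 1024  # hypothetical limit; check your API's documentation


def detect_media_type(data: bytes) -> Optional[str]:
    """Derive the media type from the file's leading bytes."""
    if data.startswith(b"\x89PNG\r\n\x1a\n"):
        return "image/png"
    if data.startswith(b"\xff\xd8\xff"):
        return "image/jpeg"
    if data[:4] == b"RIFF" and data[8:12] == b"WEBP":
        return "image/webp"
    return None


def diagnose(data: bytes, payload: str, declared_media_type: str) -> str:
    """Check format, encoding, size, then media_type; report the first failure.

    `data` is the raw file content, `payload` is the string actually placed
    in the request field, `declared_media_type` is what the request declares.
    """
    detected = detect_media_type(data)
    if detected is None:
        return "format: unrecognized or unsupported image content"
    try:
        decoded = base64.b64decode(payload, validate=True)
    except ValueError:  # binascii.Error is a subclass of ValueError
        return "encoding: payload is not valid base64 (a path, perhaps?)"
    if decoded != data:
        return "encoding: payload does not decode to the file content"
    if len(data) > MAX_BYTES:
        return f"size: {len(data)} bytes exceeds {MAX_BYTES}"
    if declared_media_type != detected:
        return f"media_type: declared {declared_media_type}, content is {detected}"
    return "ok"
```

Returning only the first failure keeps the fixed order visible: a media_type mismatch is reported only once format, encoding and size have passed.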