So the upshot is that "fine-grained" means the classes being distinguished are visually very similar. For a single image, it means the differences are hard to see across the entire image. In that sense, your top image is fine-grained. The dog and images are not. Though if you zoomed into the hair ...

