

provide the hash of an arbitrarily large file and retrieve it from the network
I sense an XY Problem scenario. Can you explain what you’re ultimately trying to build and what requirements you have?
Does the solution need to be distributed? Does the retrieval need to complete ASAP, or can it wait until the data becomes available? What sort of reliability/availability does this need? If only certain hash algorithms can be supported, which ones do you need and why?
I ask this because the answer will be drastically different if you’re building the content distribution system for a small video game versus building the successor to Kim Dotcom’s Mega file-sharing service.
Absolutely. An example of a malicious collision would be to request the file with the SHA-1 of 38762cf7f55934b34d179ae6a4c80cadccbb7f0a. But… there are two different files with that hash here.
MD5 is so broken that it has been stripped of its former status as a cryptographic hash function. Efforts are also underway to replace SHA-1 wherever it’s used: although intentionally creating a SHA-1 collision today still takes some prerequisites, it’s worth remembering that “attacks always get better, they never get worse”.
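Whatever you end up building, the receiving side will probably want to re-hash what it got and compare that against the hash it asked for. Here is a minimal Python sketch of that check, assuming SHA-256 as the algorithm (my choice, not something you specified) and reading in chunks so the file never has to fit in memory. The function names `stream_digest` and `matches_requested_hash` are made up for illustration.

```python
import hashlib
import hmac
import io


def stream_digest(stream, algorithm="sha256", chunk_size=1 << 20):
    """Hash a binary stream chunk by chunk and return the hex digest."""
    h = hashlib.new(algorithm)
    # read() returns b"" at end of stream, which stops the iteration
    for chunk in iter(lambda: stream.read(chunk_size), b""):
        h.update(chunk)
    return h.hexdigest()


def matches_requested_hash(stream, expected_hex, algorithm="sha256"):
    """Return True if the stream's digest equals the requested hex digest."""
    actual = stream_digest(stream, algorithm)
    return hmac.compare_digest(actual, expected_hex.lower())


if __name__ == "__main__":
    # Stand-in for a downloaded file; a real caller would pass open(path, "rb")
    payload = b"example payload"
    requested = hashlib.sha256(payload).hexdigest()
    print(matches_requested_hash(io.BytesIO(payload), requested))
```

Note that this only tells you the bytes match the digest. With a collision-prone algorithm like SHA-1 or MD5, a match does not tell you that you received the file the requester had in mind, which is the whole problem with the example above.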