We ran a security scan on our application and learned that SDWebImage uses MD5, a weak hash that is considered a risky cryptographic algorithm.

SD does not use cryptography.
It's a cache.
Does it mean anything if an attacker gets your local image cache? You can even rename the image file name into Original URL, so what issue does that cause?
Thanks for the quick reply,
So far we have scanned the code for potential security threats to the application, and my application uses _SDWebImage_. The scan says that the "SDWebImageCache.m" file uses the MD5 cryptographic algorithm, which is a broken or risky algorithm.
Could you please elaborate on the meaning of "You can even rename the image file name into Original URL"? I am not sure about this.
If _SDWebImage_ does not use cryptography, why is the MD5 cryptographic algorithm used in "SDWebImageCache.m"?
Thanks
Because you're not allowed to have a filename containing "/", "'", ";", etc.
But a URL can contain them.
So we need hash.
Simple answer.
You can just ignore that false-positive report.
SD does not use any cryptography and does not need MD5, SHA1, or any other particular hash function. Use MD5 is because of history code and it's fast and simple.
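To make the point above concrete, here is a minimal sketch of hashing a URL into a flat, file-system-safe name. It is written in Python for brevity and is not SDWebImage's actual code; the helper name `cache_file_name` and the example URL are assumptions made for illustration.

```python
import hashlib

def cache_file_name(url: str) -> str:
    """Hypothetical helper: turn a URL into a name that is safe to use as a file name."""
    # The hex digest contains only 0-9 and a-f, so characters such as "/" or "?"
    # from the URL can never end up in the file name.
    return hashlib.md5(url.encode("utf-8")).hexdigest()

url = "https://example.com/images/photo.png?size=large"
name = cache_file_name(url)

print(name)       # a fixed-length, 32-character hexadecimal string
print(len(name))  # 32
```

The hash is used purely as a naming transform here: nothing is being hidden, and the same URL always maps to the same name.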
So you mean that SDWebImage never uses cryptography? We were using version 4.4.8; could that version cause this warning?
Now we have updated the SDK to 5.7.3.
What do you mean by this:
Use MD5 is because of history code and it's fast and simple.
Thanks
@aviraj-ios You haven't understood what the defect report told you. The defect report told you that the MD5 hashing algorithm, once presumed to be a strong one-way hash, is no longer a strong one-way hash: it can be reversed. That means if you are encoding information _for the purpose of protecting it from clear view_, also known as cryptography, then MD5 is an attack vector ('bad'). This has been public knowledge for a long time.
What should be very clear to you if you think about it is that this library does not intend to protect this information (the filename) -- because the filename was never intended to be private. Think about it: what private information does this framework have access to, or want to protect?
What @dreampiggy has told you several times is the same thing: SDWebImage doesn't perform cryptography because it isn't protecting this information. You could simply attach a debugger, read out the memory, and see the filename of the image before the hash fires -- so what?
Hashes are often used in non-cryptographic ways, such as serializing data, or changing character sets of data to exclude illegal chars -- which is exactly what @dreampiggy said.
Next time please take the time to understand what the defect report you received means, before you file a bug.
We have a similar issue at my company, wherein our security team has scanned our codebase with an automated tool and wants us to remove any use of MD5, regardless of whether it's used for cryptographic purposes or even called by our code. Non-cryptographic use significantly lowers the priority, but they still want it removed if possible, and exceptions to this require high-level approval.
So while I completely understand the reasoning for this being a legitimate no-risk use of MD5, you may continue to run into folks who are obligated by their company to ask for its removal regardless. Is there any chance the overhead of addressing those would make it worth updating to CC_SHA256? I realize backwards compatibility is an issue. Could it maybe be an opt-in thing? (A bit messy, I know.)
IMO if corps are not willing to correctly handle valid usage of md5 then they should be willing to put up the resources (people, cash, whatever) necessary and submit a PR, or more likely in this case, maintain their own fork that has whatever changes they like. I don't see why the open-source community should spend time on this as it is neither a feature nor a bug, just some corporate fancy: why should corps receive custom work for free?
Just my opinion ... I run a small software corporation and I am not a contributor to this project.
Again.
MD5 is used here as a hashing function, not for cryptography.
MD5 is used here only because a URL cannot be saved as a filename: macOS and all POSIX systems do not allow / as a filename character. We need a transform for that.
If you want, we can change the hashing function in v6.0.0 to SHA1, SHA256, or whatever.
But anyway, we cannot do this in the current 5.x, because it is a major behavior break which affects all users, not only you and your company. As a framework author, I have a responsibility to all users around the world.
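A rough sketch of why swapping the hash function is a behavior break: files already on disk were named with the old function, so a lookup using the new function misses them. This is an illustration in Python under simplified assumptions (a dictionary stands in for the disk cache, and `name_md5` / `name_sha256` are hypothetical helpers), not SDWebImage's implementation.

```python
import hashlib

def name_md5(url: str) -> str:
    # Hypothetical "old" naming scheme
    return hashlib.md5(url.encode("utf-8")).hexdigest()

def name_sha256(url: str) -> str:
    # Hypothetical "new" naming scheme
    return hashlib.sha256(url.encode("utf-8")).hexdigest()

url = "https://example.com/images/photo.png"

# Stand-in for files written to disk by an older release.
on_disk = {name_md5(url): b"...image bytes..."}

hit_with_old_scheme = name_md5(url) in on_disk
hit_with_new_scheme = name_sha256(url) in on_disk

print(hit_with_old_scheme)  # True
print(hit_with_new_scheme)  # False: the cached file exists but is no longer found
```

After such a change, every previously cached image would have to be downloaded again unless a migration step is added.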
Next major version makes sense. Thanks for considering it, @dreampiggy. The fact that Apple deprecated CC_MD5 might make it slightly more worthwhile in the long run too, although I'm sure it'll be a pretty long time before they really remove it.
@xaphod Some very valid points there. I will pass the sentiment along to my fanciful counterparts.
The URL type has an instance property hashValue which returns an Int value that can be turned into a String and stored as the unique identifier in the cache. This eliminates the need for cryptographic tools and simplifies the code base.
Apple does not provide any guarantee that hashValue will always be valid and stable across iOS versions. If they change the implementation between iOS versions, users upgrading from an old iOS version will lose the cache in the current app.
And they do not guarantee that hashValue will be a valid file name. Our custom MD5 logic can guarantee that. See the reason above.
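The stability concern can be illustrated with a close analogue. Swift's documentation for Hashable states that hash values are not guaranteed to be equal across different executions of a program and should not be saved for later use. Python's built-in `hash()` for strings behaves similarly, because it is seeded per process, so the sketch below uses it as a stand-in. The helper name `builtin_hash_in_new_process` is an assumption for this example.

```python
import hashlib
import os
import subprocess
import sys

URL = "https://example.com/a.png"

def builtin_hash_in_new_process(seed: str) -> str:
    """Hypothetical helper: compute hash(URL) in a fresh interpreter with a given hash seed."""
    env = dict(os.environ, PYTHONHASHSEED=seed)
    code = f"print(hash({URL!r}))"
    result = subprocess.run(
        [sys.executable, "-c", code],
        env=env, capture_output=True, text=True, check=True,
    )
    return result.stdout.strip()

# Two "launches" with different seeds produce different built-in hash values,
# so a file named after one run would not be found in the next.
run_a = builtin_hash_in_new_process("1")
run_b = builtin_hash_in_new_process("2")

# A digest computed from the URL bytes is the same in every process.
digest_first = hashlib.md5(URL.encode("utf-8")).hexdigest()
digest_second = hashlib.md5(URL.encode("utf-8")).hexdigest()

print(run_a, run_b)
print(digest_first == digest_second)  # True
```

A persistent cache key needs the second property: the same input must give the same name on every launch and every OS version.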
If they call it hashValue, that means they guarantee a minimum quality level of their hasher. MD5 has collisions as well.
If keeping the cache between new releases is an important feature, then instead of using any security hasher it would be better to have something internal, perhaps something based on Base64 or Base32 strings.
If you choose SHA256, in a few years you will get a new ticket with the same issue: "My corporation's IT department says that SHA256 is weak now and not allowed in our code base."
@Insofan That #3069 should be reconsidered. I think we should use a custom hashing function, not SHA256, that supports SDWebImage's use case:
File-path-compatible string
Are you sure about implementing a custom hash function? At the first glance it sounds like a workaround to avoid these "security" messages from corporations.
As for me, I believe that changing MD5 to SHA256 would be pretty helpful for at least 2 reasons:
1) No more open issues about "security" until SHA256 is deprecated.
2) SHA256 has a lower chance of collision than MD5. Collisions are unwanted in this case because the hash value is used for generating file names. If a collision happens, the user might see the wrong image.
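For a sense of scale on point 2, the chance of an accidental collision can be estimated with the birthday bound: with n distinct URLs and a b-bit digest, the probability is at most about n(n-1)/2 divided by 2^b. The sketch below assumes digests behave like uniformly random values and that the cache holds one million URLs (an arbitrary figure chosen for illustration). It says nothing about deliberately constructed collisions, which are practical for MD5.

```python
def accidental_collision_bound(n_urls: int, digest_bits: int) -> float:
    """Upper bound on the chance that any two of n_urls share a digest (birthday bound)."""
    pairs = n_urls * (n_urls - 1) // 2
    return pairs / 2 ** digest_bits

n = 1_000_000  # assumed cache size, for illustration only

p_md5 = accidental_collision_bound(n, 128)     # MD5 digests are 128 bits
p_sha256 = accidental_collision_bound(n, 256)  # SHA-256 digests are 256 bits

print(f"MD5:     {p_md5:.2e}")
print(f"SHA-256: {p_sha256:.2e}")
```

Under these assumptions the MD5 figure is on the order of 1e-27, so SHA256 does lower the probability, but accidental collisions are already extremely unlikely with either function.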
@Adobels Do you have a good idea or algorithm for a custom hash function?
This subject concerns not only corporations but anyone who uses security audit tools. In release 13.5 (22 October 2020), GitLab added a free tool for mobile security scanning, so GitLab clients such as startups are affected as well.
There is an encoding, "Base 64 Encoding with URL and Filename Safe Alphabet", which can be used to get around the problem of special characters in file names that would collide with those used by Linux or Darwin to denote directories, schemes and so on. Another problem solved by Base64URL encoding is the stability of the cache content across SDWebImage releases. If SHA256 is chosen, then when SHA256 is declared weak, a transition to SHA512 will once again invalidate the cache content (@dreampiggy stated that cache content stability is a feature of the framework: "If they change the implementation between iOS versions, users upgrading from an old iOS version will lose the cache in the current app.").
So, my proposal is to take a look at the "Base 64 Encoding with URL and Filename Safe Alphabet" encoding, which can be a universal solution for Linux, Darwin and other platforms which will be supported by Swift in the future.
There is an algorithm that translates from Base64 to Base64URL, written by Martin-R, one of the top users on StackOverflow, and published on StackOverflow.
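As a rough illustration of this proposal (not the StackOverflow algorithm mentioned above), Python's standard library already exposes the URL- and filename-safe Base64 alphabet, which replaces "+" and "/" with "-" and "_". The helper name `base64url_name` is hypothetical, and dropping the "=" padding is a choice made for this sketch.

```python
import base64
import string

def base64url_name(url: str) -> str:
    """Hypothetical helper: encode a URL with the URL- and filename-safe Base64 alphabet."""
    encoded = base64.urlsafe_b64encode(url.encode("utf-8")).decode("ascii")
    return encoded.rstrip("=")  # padding removed here; keep it if decoding is needed

short_url = "https://example.com/a.png"
long_url = "https://example.com/images/photo.png?size=large&v=2"

short_name = base64url_name(short_url)
long_name = base64url_name(long_url)

safe_alphabet = set(string.ascii_letters + string.digits + "-_")

print(short_name)
print(set(short_name) <= safe_alphabet)  # True: no "/" or "+" can appear
print(len(short_name), len(long_name))   # the name grows with the URL
```

One difference from a digest is visible in the last line: the encoded name is about a third longer than the URL itself, whereas a hash always has a fixed length.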
@Adobels sounds great!
I can implement it if the idea is approved. Do we need this idea (Base64URL) to be approved before implementing it, or is it better to open a PR ASAP?
@dreampiggy?
"Base 64 Encoding with URL and Filename Safe Alphabet"
This is a great solution. We can use it to avoid file system issues.
However, one personal concern: I'm afraid this may create implicit interfaces. Base64 encoding can be decoded to get the original URL.
With Base64, some users may think: OK, I can read the cache file names to find out which URLs these images came from. This is not something SDWebImage promotes or guarantees to be usable, and it may break in the future, for example in 7.x. But anyway, you can ignore these cases.
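The reversibility concern is easy to demonstrate. In this sketch (Python, with an example URL chosen for illustration), a Base64URL file name decodes straight back to the source URL, while a digest-based name does not contain the URL at all.

```python
import base64
import hashlib

url = "https://example.com/images/photo.png?size=large"

# Base64URL name: an encoding, so anyone holding the file name can decode it.
b64_name = base64.urlsafe_b64encode(url.encode("utf-8")).decode("ascii")
recovered = base64.urlsafe_b64decode(b64_name).decode("utf-8")

# Digest name: fixed-length output from which the URL cannot simply be read back.
digest_name = hashlib.md5(url.encode("utf-8")).hexdigest()

print(recovered == url)  # True
print(digest_name)
```

This is what would make the file name an implicit interface: callers could start depending on decoding it, even though that was never promised.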
which can be a universal solution for Linux, Darwin and other platforms which will be supported by Swift in the future.
SDWebImage does not have a clear plan to release on Linux or Windows. We're bound to Apple's frameworks. Only if Apple ports UIKit and ImageIO to Linux/Windows will we be able to support those platforms. Swift being available there as a language does not mean that we, as a wrapper around Apple's frameworks, can be :(
@Adobels I have updated this PR. Please have a look.