Formats
Last updated
Last updated
Various data formats have their own quirks.
A couple of things to look out for:
Is any data parsed ?
Is the data validated before or after unicode normalization?
Is any of the data part of a turing complete configuration language?
Even something this simple has multiple dialects and varieties. Mess around with whitespace and separators and quotes.
There are many less-common IP address formats. Try them.
Different parsers deal with special cases differently. Perhaps if an app uses one library for parsing and a different one for parsing+validation, you can bypass validation by duplicating keys?
They can use polyglots to hide their code. We can use polyglots to bypass type checks.
Yes, officer, that's a GIF I'm uploading. Oh, but it's also PHP code.
[A-z]
includes more than just letters.
Domain name filters can sometimes be bypassed by unicode normalization exploits (see above).
Some platforms (NodeJS) are more permissive about URL formats:
Also try the tricks listed under IP address
Different XML-based formats have their own cavities. Can the schema be bypassed to include dangerous elements?
Can the XML "import" external resources?
If, for some reason, your payload is inspected in an actual real-life terminal (-emulator), you may want to try . Include escapes in your payload such that a terminal will overwrite the sensitive text with benign-looking data.
If a regex is used to split the host, perhaps URL parsing can be fooled. Real parsers take everything before an @
as a username.
()
to auto-detect layered text transforms is good for messing around with known chains identifies hash types
can't contain underscores, except that subdomains certainly can. Neat.
parse URLs identically.