Formats

Various data formats have their own quirks.

A couple of things to look out for:

  1. Is the data validated before or after unicode normalization?

  2. Is any of the data part of a turing complete configuration language?

CSV

Even something this simple has multiple dialects and varieties. Mess around with whitespace and separators and quotes.

IP address

Twitter threadarrow-up-right Another twitter threadarrow-up-right Blog postarrow-up-right Another Blog postarrow-up-right Toolarrow-up-right

$ ping 0177.1
$ ping 134744072
$ ping 0x8080808
$ ping 010.0x0000008.00000010.8
$ ping 8.0x0000000000000080808
$ ping 192.168.36095
$ ping 192.11046143
$ ping 0000000001.0000000002.0000000003.000000004

There are many less-common IP address formats. Try them.

JSON

Tell me morearrow-up-right

Different parsers deal with special cases differently. Perhaps if an app uses one library for parsing and a different one for parsing+validation, you can bypass validation by duplicating keys?

Magic Bytes

List ordered by magicarrow-up-right

PDF

Insecure PDF featuresarrow-up-right

Polyglots

Tell me more Mitraarrow-up-right

They can use polyglots to hide their code. We can use polyglots to bypass type checks.

Yes, officer, that's a GIF I'm uploading. Oh, but it's also PHP code.

Regular Expressions

Toolarrow-up-right

Ranges

[A-z] includes more than just letters.

Terminals

If, for some reason, your payload is inspected in an actual real-life terminal (-emulator), you may want to try Terminal Escape Injectionarrow-up-right. Include escapes in your payload such that a terminal will overwrite the sensitive text with benign-looking data.

URL filtering

Examplearrow-up-right If a regex is used to split the host, perhaps URL parsing can be fooled. Real parsers take everything before an @ as a username.

Unicode

Normalization

Tell me morearrow-up-right Listarrow-up-right Toolarrow-up-right Toolarrow-up-right Toolarrow-up-right

(Sourcearrow-up-right)

Unknown encodings

Toolarrow-up-right to auto-detect layered text transforms Cyberchefarrow-up-right is good for messing around with known chains Haitiarrow-up-right identifies hash types

URL

Domain namesarrow-up-right can't contain underscores, except that subdomains certainly can. Neat.

Domain name filters can sometimes be bypassed by unicode normalization exploits (see above).

Not all languagesarrow-up-right parse URLs identically.

Some platforms (NodeJS) are more permissive about URL formats:

Also try the tricks listed under IP address

XML

Different XML-based formats have their own cavities. Can the schema be bypassed to include dangerous elements?

XXE

Tell me morearrow-up-right Tell me even morearrow-up-right You guessed itarrow-up-right Toolarrow-up-right Work on itarrow-up-right What about dirty files?arrow-up-right

Can the XML "import" external resources?

Last updated