Project author: ajinabraham

Project description:
Generic SAST Library
Language: Python
Project URL: git://github.com/ajinabraham/libsast.git
Created: 2020-04-09T23:56:22Z
Project community: https://github.com/ajinabraham/libsast

License: GNU Lesser General Public License v2.1

libsast

Generic SAST for Security Engineers. Powered by a regex-based pattern matcher and semantic-aware semgrep.

Support libsast

  • Donate via PayPal
  • Sponsor the project: GitHub Sponsors

Install

    pip install semgrep==1.86.0  # For semgrep support
    pip install libsast

Pattern Matcher is cross-platform, but Semgrep supports only Mac and Linux.

Command Line Options

    $ libsast
    usage: libsast [-h] [-o OUTPUT] [-p PATTERN_FILE] [-s SGREP_PATTERN_FILE]
                   [--sgrep-file-extensions SGREP_FILE_EXTENSIONS [SGREP_FILE_EXTENSIONS ...]]
                   [--file-extensions FILE_EXTENSIONS [FILE_EXTENSIONS ...]]
                   [--ignore-filenames IGNORE_FILENAMES [IGNORE_FILENAMES ...]]
                   [--ignore-extensions IGNORE_EXTENSIONS [IGNORE_EXTENSIONS ...]]
                   [--ignore-paths IGNORE_PATHS [IGNORE_PATHS ...]]
                   [--show-progress] [--cpu-core CPU_CORE] [-v]
                   [path ...]

    positional arguments:
      path                  Path can be file(s) or directories

    options:
      -h, --help            show this help message and exit
      -o OUTPUT, --output OUTPUT
                            Output filename to save JSON report.
      -p PATTERN_FILE, --pattern-file PATTERN_FILE
                            YAML pattern file, directory or url
      -s SGREP_PATTERN_FILE, --sgrep-pattern-file SGREP_PATTERN_FILE
                            sgrep rules directory
      --sgrep-file-extensions SGREP_FILE_EXTENSIONS [SGREP_FILE_EXTENSIONS ...]
                            File extensions that should be scanned with semantic
                            grep
      --file-extensions FILE_EXTENSIONS [FILE_EXTENSIONS ...]
                            File extensions that should be scanned with pattern
                            matcher
      --ignore-filenames IGNORE_FILENAMES [IGNORE_FILENAMES ...]
                            File name(s) to ignore
      --ignore-extensions IGNORE_EXTENSIONS [IGNORE_EXTENSIONS ...]
                            File extension(s) to ignore in lower case
      --ignore-paths IGNORE_PATHS [IGNORE_PATHS ...]
                            Path(s) to ignore
      --show-progress       Show scan progress
      --cpu-core CPU_CORE   No of CPU cores to use. Use all cores by default
      -v, --version         Show libsast version
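
When libsast is driven from a script, the argument list can be assembled programmatically. The following is a minimal sketch that only uses flags listed in the help text above; the function name `build_libsast_command` and its parameters are hypothetical and not part of libsast. It places the scan paths last, mirroring the documented example invocation.

```python
from typing import List, Optional, Sequence


def build_libsast_command(
    paths: Sequence[str],
    pattern_file: Optional[str] = None,
    sgrep_pattern_file: Optional[str] = None,
    output: Optional[str] = None,
    cpu_core: Optional[int] = None,
    show_progress: bool = False,
) -> List[str]:
    """Assemble an argv list for the libsast CLI (hypothetical helper)."""
    cmd = ["libsast"]
    if output:
        cmd += ["-o", output]  # JSON report file
    if pattern_file:
        cmd += ["-p", pattern_file]  # pattern matcher rules
    if sgrep_pattern_file:
        cmd += ["-s", sgrep_pattern_file]  # semantic grep rules
    if cpu_core is not None:
        cmd += ["--cpu-core", str(cpu_core)]
    if show_progress:
        cmd.append("--show-progress")
    # Positional paths (files or directories) come last.
    cmd += list(paths)
    return cmd


print(build_libsast_command(
    ["tests/assets/files/"],
    pattern_file="tests/assets/rules/pattern_matcher/",
    sgrep_pattern_file="tests/assets/rules/semantic_grep/",
))
```

The resulting list can be handed to `subprocess.run` if libsast is installed; the sketch itself does not execute anything.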

Example Usage

    $ libsast -s tests/assets/rules/semantic_grep/ -p tests/assets/rules/pattern_matcher/ tests/assets/files/
    {
      "pattern_matcher": {
        "test_regex": {
          "files": [
            {
              "file_path": "tests/assets/files/test_matcher.test",
              "match_lines": [
                28,
                28
              ],
              "match_position": [
                1141,
                1149
              ],
              "match_string": ".close()"
            }
          ],
          "metadata": {}
        },
        "test_regex_and": {
          "files": [
            {
              "file_path": "tests/assets/files/test_matcher.test",
              "match_lines": [
                3,
                3
              ],
              "match_position": [
                52,
                66
              ],
              "match_string": "webkit.WebView"
            },
            {
              "file_path": "tests/assets/files/test_matcher.test",
              "match_lines": [
                7,
                7
              ],
              "match_position": [
                194,
                254
              ],
              "match_string": ".loadUrl(\"file:/\" + Environment.getExternalStorageDirectory("
            }
          ],
          "metadata": {}
        },
        "test_regex_and_not": {
          "files": [
            {
              "file_path": "tests/assets/files/test_matcher.test",
              "match_lines": [
                42,
                42
              ],
              "match_position": [
                1415,
                1424
              ],
              "match_string": "WKWebView"
            },
            {
              "file_path": "tests/assets/files/test_matcher.test",
              "match_lines": [
                40,
                40
              ],
              "match_position": [
                1363,
                1372
              ],
              "match_string": "WKWebView"
            }
          ],
          "metadata": {}
        },
        "test_regex_and_or": {
          "files": [
            {
              "file_path": "tests/assets/files/test_matcher.test",
              "match_lines": [
                50,
                50
              ],
              "match_position": [
                1551,
                1571
              ],
              "match_string": "telephony.SmsManager"
            },
            {
              "file_path": "tests/assets/files/test_matcher.test",
              "match_lines": [
                58,
                58
              ],
              "match_position": [
                1973,
                1988
              ],
              "match_string": "sendTextMessage"
            }
          ],
          "metadata": {}
        },
        "test_regex_multiline_and_metadata": {
          "files": [
            {
              "file_path": "tests/assets/files/test_matcher.test",
              "match_lines": [
                52,
                52
              ],
              "match_position": [
                1586,
                1684
              ],
              "match_string": "public void onRequestPermissionsResult(int requestCode,String permissions[], int[] grantResults) {"
            },
            {
              "file_path": "tests/assets/files/test_matcher.test",
              "match_lines": [
                10,
                11
              ],
              "match_position": [
                297,
                368
              ],
              "match_string": "public static ForgeAccount add(Context context, ForgeAccount account) {"
            }
          ],
          "metadata": {
            "cwe": "CWE-1051 Initialization with Hard-Coded Network Resource Configuration Data",
            "description": "This is a rule to test regex",
            "foo": "bar",
            "masvs": "MSTG-STORAGE-3",
            "owasp-mobile": "M1: Improper Platform Usage",
            "owasp-web": "A10: Insufficient Logging & Monitoring",
            "severity": "info"
          }
        },
        "test_regex_or": {
          "files": [
            {
              "file_path": "tests/assets/files/test_matcher.test",
              "match_lines": [
                26,
                26
              ],
              "match_position": [
                1040,
                1067
              ],
              "match_string": "Context.MODE_WORLD_READABLE"
            }
          ],
          "metadata": {}
        }
      },
      "semantic_grep": {
        "errors": [
          {
            "code": 3,
            "level": "warn",
            "message": "Semgrep Core WARN - Lexical error in file tests/assets/files/test_matcher.test:40\n\tunrecognized symbols: !",
            "path": "tests/assets/files/test_matcher.test",
            "type": "Lexical error"
          }
        ],
        "matches": {
          "boto-client-ip": {
            "files": [
              {
                "file_path": "tests/assets/files/example_file.py",
                "match_lines": [
                  4,
                  4
                ],
                "match_position": [
                  24,
                  31
                ],
                "match_string": "c = boto3.client(host='8.8.8.8')"
              }
            ],
            "metadata": {
              "cwe": "CWE-1050 Excessive Platform Resource Consumption within a Loop",
              "description": "boto client using IP address",
              "owasp-web": "A8: Insecure Deserialization",
              "severity": "ERROR"
            }
          }
        }
      }
    }
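
A JSON report of this shape (for instance one saved with `-o`) can be summarized with a few lines of Python. This is a sketch under the assumption that the report follows the structure shown above; `count_findings` is a hypothetical helper, and the inline sample is a trimmed copy of the output above rather than a full report.

```python
import json
from typing import Dict


def count_findings(report: dict) -> Dict[str, int]:
    """Map 'engine:rule_id' to the number of matched locations."""
    counts: Dict[str, int] = {}
    # Pattern matcher findings sit directly under the engine key.
    for rule_id, finding in report.get("pattern_matcher", {}).items():
        counts[f"pattern_matcher:{rule_id}"] = len(finding.get("files", []))
    # Semantic grep findings are nested under "matches", next to "errors".
    matches = report.get("semantic_grep", {}).get("matches", {})
    for rule_id, finding in matches.items():
        counts[f"semantic_grep:{rule_id}"] = len(finding.get("files", []))
    return counts


# Trimmed sample with the same shape as the report above.
sample = json.loads("""
{
  "pattern_matcher": {
    "test_regex": {
      "files": [
        {"file_path": "tests/assets/files/test_matcher.test",
         "match_lines": [28, 28],
         "match_position": [1141, 1149],
         "match_string": ".close()"}
      ],
      "metadata": {}
    }
  },
  "semantic_grep": {
    "errors": [],
    "matches": {
      "boto-client-ip": {
        "files": [
          {"file_path": "tests/assets/files/example_file.py",
           "match_lines": [4, 4],
           "match_position": [24, 31],
           "match_string": "c = boto3.client(host='8.8.8.8')"}
        ],
        "metadata": {"severity": "ERROR"}
      }
    }
  }
}
""")

print(count_findings(sample))
# {'pattern_matcher:test_regex': 1, 'semantic_grep:boto-client-ip': 1}
```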

Python API

    >>> from libsast import Scanner
    >>> options = {
    ...     'match_rules': '/Users/ajinabraham/Code/njsscan/njsscan/rules/pattern_matcher',
    ...     'sgrep_rules': '/Users/ajinabraham/Code/njsscan/njsscan/rules/semantic_grep',
    ...     'sgrep_extensions': {'', '.js'},
    ...     'match_extensions': {'.hbs', '.sh', '.ejs', '.toml', '.mustache', '.tmpl', '.jade', '.json', '.ect', '.vue', '.yml', '.hdbs', '.tl', '.html', '.haml', '.dust', '.pug', '.tpl'},
    ...     'ignore_filenames': {'bootstrap.min.js', '.DS_Store', 'bootstrap-tour.js', 'd3.min.js', 'tinymce.js', 'codemirror.js', 'tinymce.min.js', 'react-dom.production.min.js', 'react.js', 'jquery.min.js', 'react.production.min.js', 'codemirror-compressed.js', 'axios.min.js', 'angular.min.js', 'raphael-min.js', 'vue.min.js'},
    ...     'ignore_extensions': {'.7z', '.exe', '.rar', '.zip', '.a', '.o', '.tz'},
    ...     'ignore_paths': {'__MACOSX', 'jquery', 'fixtures', 'node_modules', 'bower_components', 'example', 'spec'},
    ...     'show_progress': False,
    ... }
    >>> paths = ['../njsscan/tests/assets/dot_njsscan/']
    >>> scanner = Scanner(options, paths)
    >>> scanner.scan()
    {'pattern_matcher': {'handlebar_mustache_template': {'files': [{'file_path': '../njsscan/tests/assets/dot_njsscan/ignore_ext.hbs', 'match_string': '{{{html}}}', 'match_position': (52, 62), 'match_lines': (1, 1)}], 'metadata': {'id': 'handlebar_mustache_template', 'description': 'The Handlebar.js/Mustache.js template has an unescaped variable. Untrusted user input passed to this variable results in Cross Site Scripting (XSS).', 'type': 'Regex', 'pattern': '{{{.+}}}|{{[ ]*&[\\w]+.*}}', 'severity': 'ERROR', 'input_case': 'exact', 'cwe': "CWE-79: Improper Neutralization of Input During Web Page Generation ('Cross-site Scripting')", 'owasp': 'A1: Injection'}}}, 'semantic_grep': {'matches': {'node_aes_ecb': {'files': [{'file_path': '../njsscan/tests/assets/dot_njsscan/lorem_scan.js', 'match_position': (16, 87), 'match_lines': (14, 14), 'match_string': "let decipher = crypto.createDecipheriv('aes-128-ecb', Buffer.from(ENCRYPTION_KEY), iv);"}], 'metadata': {'owasp': 'A9: Using Components with Known Vulnerabilities', 'cwe': 'CWE-327: Use of a Broken or Risky Cryptographic Algorithm', 'description': 'AES with ECB mode is deterministic in nature and not suitable for encrypting large amount of repetitive data.', 'severity': 'ERROR'}}, 'node_tls_reject': {'files': [{'file_path': '../njsscan/tests/assets/dot_njsscan/skip_dir/skip_me.js', 'match_position': (9, 58), 'match_lines': (9, 9), 'match_string': " process.env['NODE_TLS_REJECT_UNAUTHORIZED'] = '0';"}, {'file_path': '../njsscan/tests/assets/dot_njsscan/skip_dir/skip_me.js', 'match_position': (9, 55), 'match_lines': (18, 18), 'match_string': ' process.env.NODE_TLS_REJECT_UNAUTHORIZED = "0";'}], 'metadata': {'owasp': 'A6: Security Misconfiguration', 'cwe': 'CWE-295: Improper Certificate Validation', 'description': "Setting 'NODE_TLS_REJECT_UNAUTHORIZED' to 0 will allow node server to accept self signed certificates and is not a secure behaviour.", 'severity': 'ERROR'}}, 'node_curl_ssl_verify_disable': {'files': [{'file_path': '../njsscan/tests/assets/dot_njsscan/skip_dir/skip_me.js', 'match_position': (5, 11), 'match_lines': (45, 51), 'match_string': ' curl(url,\n\n {\n\n SSL_VERIFYPEER: 0\n\n },\n\n function (err) {\n\n response.end(this.body);\n\n })'}], 'metadata': {'owasp': 'A6: Security Misconfiguration', 'cwe': 'CWE-599: Missing Validation of OpenSSL Certificate', 'description': 'SSL Certificate verification for node-curl is disabled.', 'severity': 'ERROR'}}, 'regex_injection_dos': {'files': [{'file_path': '../njsscan/tests/assets/dot_njsscan/lorem_scan.js', 'match_position': (5, 37), 'match_lines': (25, 27), 'match_string': ' var key = req.param("key");\n\n // Regex created from user input\n\n var re = new RegExp("\\\\b" + key);'}], 'metadata': {'owasp': 'A1: Injection', 'cwe': 'CWE-400: Uncontrolled Resource Consumption', 'description': 'User controlled data in RegExp() can make the application vulnerable to layer 7 DoS.', 'severity': 'ERROR'}}, 'express_xss': {'files': [{'file_path': '../njsscan/tests/assets/dot_njsscan/skip.js', 'match_position': (9, 55), 'match_lines': (7, 10), 'match_string': ' var str = new Buffer(req.cookies.profile, \'base64\').toString();\n\n var obj = serialize.unserialize(str);\n\n if (obj.username) {\n\n res.send("Hello " + escape(obj.username));'}], 'metadata': {'owasp': 'A1: Injection', 'cwe': "CWE-79: Improper Neutralization of Input During Web Page Generation ('Cross-site Scripting')", 'description': 'Untrusted User Input in Response will result in Reflected Cross Site Scripting Vulnerability.', 'severity': 'ERROR'}}, 'generic_path_traversal': {'files': [{'file_path': '../njsscan/tests/assets/dot_njsscan/lorem_scan.js', 'match_position': (5, 35), 'match_lines': (36, 37), 'match_string': " var filePath = path.join(__dirname, '/' + req.query.load);\n\n fileSystem.readFile(filePath); // ignore: generic_path_traversal"}, {'file_path': '../njsscan/tests/assets/dot_njsscan/lorem_scan.js', 'match_position': (5, 35), 'match_lines': (42, 43), 'match_string': " var filePath = path.join(__dirname, '/' + req.query.load);\n\n fileSystem.readFile(filePath); // detect this"}], 'metadata': {'owasp': 'A5: Broken Access Control', 'cwe': 'CWE-23: Relative Path Traversal', 'description': 'Untrusted user input in readFile()/readFileSync() can endup in Directory Traversal Attacks.', 'severity': 'ERROR'}}, 'express_open_redirect': {'files': [{'file_path': '../njsscan/tests/assets/dot_njsscan/lorem_scan.js', 'match_position': (5, 26), 'match_lines': (49, 51), 'match_string': ' var target = req.param("target");\n\n // BAD: sanitization doesn\'t apply here\n\n res.redirect(target); //ignore: express_open_redirect'}], 'metadata': {'owasp': 'A1: Injection', 'cwe': "CWE-601: URL Redirection to Untrusted Site ('Open Redirect')", 'description': 'Untrusted user input in redirect() can result in Open Redirect vulnerability.', 'severity': 'ERROR'}}, 'node_deserialize': {'files': [{'file_path': '../njsscan/tests/assets/dot_njsscan/skip.js', 'match_position': (19, 45), 'match_lines': (8, 8), 'match_string': ' var obj = serialize.unserialize(str);'}], 'metadata': {'owasp': 'A8: Insecure Deserialization', 'cwe': 'CWE-502: Deserialization of Untrusted Data', 'description': "User controlled data in 'unserialize()' or 'deserialize()' function can result in Object Injection or Remote Code Injection.", 'severity': 'ERROR'}}}, 'errors': [{'type': 'SourceParseError', 'code': 3, 'short_msg': 'parse error', 'long_msg': 'Could not parse .njsscan as javascript', 'level': 'warn', 'spans': [{'start': {'line': 2, 'col': 20}, 'end': {'line': 2, 'col': 21}, 'source_hash': 'c60298be568bfb1325d92cbb3c0bc1450a25b85bb2e4000bdc3267c05f1c8c73', 'file': '.njsscan', 'context_start': None, 'context_end': None}], 'help': 'If the code appears to be valid, this may be a semgrep bug.'}, {'type': 'SourceParseError', 'code': 3, 'short_msg': 'parse error', 'long_msg': 'Could not parse no_ext_scan as javascript', 'level': 'warn', 'spans': [{'start': {'line': 1, 'col': 3}, 'end': {'line': 1, 'col': 5}, 'source_hash': 'f002e2a715be216987dd1b134e7b9fa6eef28e3caa82dead0109c4cdc489e089', 'file': 'no_ext_scan', 'context_start': None, 'context_end': None}], 'help': 'If the code appears to be valid, this may be a semgrep bug.'}]}}
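
The dictionary returned by `scan()` is deeply nested, so it is often easier to flatten it before filtering or reporting. The sketch below assumes the result shape shown above; `iter_findings` and `with_severity` are hypothetical helpers, not libsast functions. Severity is upper-cased before comparison because the sample outputs in this document use both `info` and `ERROR`.

```python
from typing import Iterator, List, Tuple

# (engine, rule_id, file_path, (first_line, last_line), severity)
Finding = Tuple[str, str, str, Tuple[int, int], str]


def iter_findings(result: dict) -> Iterator[Finding]:
    """Yield one flat tuple per matched location in a scan result."""
    engines = {
        "pattern_matcher": result.get("pattern_matcher", {}),
        "semantic_grep": result.get("semantic_grep", {}).get("matches", {}),
    }
    for engine, rules in engines.items():
        for rule_id, finding in rules.items():
            metadata = finding.get("metadata", {})
            severity = str(metadata.get("severity", "")).upper()
            for hit in finding.get("files", []):
                lines = tuple(hit["match_lines"])
                yield (engine, rule_id, hit["file_path"], lines, severity)


def with_severity(result: dict, wanted: str = "ERROR") -> List[Finding]:
    """Keep only findings whose rule severity equals `wanted`."""
    return [f for f in iter_findings(result) if f[4] == wanted.upper()]


# Trimmed sample in the shape of the scan() output above.
result = {
    "pattern_matcher": {
        "handlebar_mustache_template": {
            "files": [{
                "file_path": "../njsscan/tests/assets/dot_njsscan/ignore_ext.hbs",
                "match_string": "{{{html}}}",
                "match_position": (52, 62),
                "match_lines": (1, 1),
            }],
            "metadata": {"severity": "ERROR"},
        },
    },
    "semantic_grep": {"matches": {}, "errors": []},
}

for finding in with_severity(result):
    print(finding)
```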

Write your own Static Analysis tool

With libsast, you can write your own static analysis tools. libsast provides two matching engines:

  1. Pattern Matcher
  2. Semantic Grep

Pattern Matcher

Currently Pattern Matcher supports any language.

Use Regex 101 to write simple Python Regex rule patterns.

A sample rule looks like

    - id: test_regex_or
      message: This is a rule to test regex_or
      input_case: exact
      pattern:
      - MODE_WORLD_READABLE|Context\.MODE_WORLD_READABLE
      - openFileOutput\(\s*".+"\s*,\s*1\s*\)
      severity: error
      type: RegexOr
      metadata:
        owasp-web: a1
        reference: http://foo.bar
        foo: Some extra metadata

A rule consists of

  • id: A unique id for the rule.
  • message: A description for the rule.
  • input_case: One of exact, upper or lower. The input is left as it is, converted to upper case, or converted to lower case before it is compared with the regex.
  • pattern: The pattern or list of patterns to match; the expected form depends on type.
  • severity: One of error, warning or info.
  • type: Pattern Matcher supports Regex, RegexAnd, RegexOr, RegexAndOr and RegexAndNot:

    1. Regex - if regex1 in input
    2. RegexAnd - if regex1 in input and regex2 in input
    3. RegexOr - if regex1 in input or regex2 in input
    4. RegexAndOr - if regex1 in input and (regex2 in input or regex3 in input)
    5. RegexAndNot - if regex1 in input and not regex2 in input

  • metadata (optional): Define your own custom fields that you can use as metadata along with standard mappings.
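
The five rule types reduce to simple boolean combinations of regex searches. The following is a minimal sketch of those semantics using Python's `re` module, not libsast's actual implementation. The function names are hypothetical, and the nested-list layout used for RegexAndOr (`[regex1, [regex2, regex3]]`) is an assumption made for this illustration.

```python
import re
from typing import List


def normalize(text: str, input_case: str) -> str:
    """Apply the rule's input_case before matching."""
    if input_case == "lower":
        return text.lower()
    if input_case == "upper":
        return text.upper()
    return text  # "exact": leave the input untouched


def rule_matches(rule_type: str, patterns: List, text: str,
                 input_case: str = "exact") -> bool:
    """Evaluate one rule type against text (illustrative only)."""
    data = normalize(text, input_case)

    def found(regex: str) -> bool:
        return re.search(regex, data) is not None

    if rule_type == "Regex":
        return found(patterns[0])
    if rule_type == "RegexAnd":
        return all(found(p) for p in patterns)
    if rule_type == "RegexOr":
        return any(found(p) for p in patterns)
    if rule_type == "RegexAndOr":
        # Assumed layout: [regex1, [regex2, regex3]]
        first, alternatives = patterns[0], patterns[1]
        return found(first) and any(found(p) for p in alternatives)
    if rule_type == "RegexAndNot":
        return found(patterns[0]) and not found(patterns[1])
    raise ValueError(f"unknown rule type: {rule_type}")


# The two patterns from the sample rule above.
p1 = r"MODE_WORLD_READABLE|Context\.MODE_WORLD_READABLE"
p2 = r'openFileOutput\(\s*".+"\s*,\s*1\s*\)'
code = 'openFileOutput("f", Context.MODE_WORLD_READABLE)'

print(rule_matches("RegexOr", [p1, p2], code))   # True: p1 matches
print(rule_matches("RegexAnd", [p1, p2], code))  # False: p2 does not match
```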

Example: Pattern Matcher Rule

Test your pattern matcher rules

$ libsast -p tests/assets/rules/pattern_matcher/patterns.yaml tests/assets/files/

Inbuilt Standard Mapping Support

Metadata fields also support libsast standard mapping.

For example, the metadata field owasp-web: a1 is expanded at runtime to owasp-web: 'A1: Injection'.
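
Conceptually, this expansion is a lookup from a short code to its full label. The sketch below illustrates the idea only; it is not libsast's mapping code. The tables are hypothetical and contain just the labels that appear in this document, and `expand_metadata` is an invented name.

```python
# Hypothetical lookup tables, limited to labels seen in this document.
OWASP_WEB = {
    "a1": "A1: Injection",
    "a8": "A8: Insecure Deserialization",
    "a10": "A10: Insufficient Logging & Monitoring",
}
OWASP_MOBILE = {
    "m1": "M1: Improper Platform Usage",
}
STANDARDS = {"owasp-web": OWASP_WEB, "owasp-mobile": OWASP_MOBILE}


def expand_metadata(metadata: dict) -> dict:
    """Replace known short codes with full labels; keep the rest as is."""
    expanded = {}
    for key, value in metadata.items():
        table = STANDARDS.get(key)
        if table is None:
            expanded[key] = value  # custom field, not a standard mapping
        else:
            # Unknown codes fall back to the original value.
            expanded[key] = table.get(str(value).lower(), value)
    return expanded


print(expand_metadata({"owasp-web": "a1", "foo": "Some extra metadata"}))
# {'owasp-web': 'A1: Injection', 'foo': 'Some extra metadata'}
```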

Currently Supports

  • OWASP Web Top 10 (owasp-web)
  • OWASP Mobile Top 10 (owasp-mobile)
  • CWE (cwe)

Semantic Grep

Semantic Grep uses semgrep, a fast and syntax-aware semantic code pattern search for many languages: like grep but for code.

Currently it supports Python, Java, JavaScript, Go and C.

Use semgrep.dev to write semantic grep rule patterns.

A sample rule for Python code looks like

    rules:
      - id: boto-client-ip
        patterns:
          - pattern-inside: boto3.client(host="...")
          - pattern-regex: '\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}'
        message: "boto client using IP address"
        languages: [python]
        severity: ERROR
        metadata:
          owasp-web: a2
          owasp-mobile: m7
          cwe: cwe-1048
          foo: Some extra metadata
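
To get a feel for the `pattern-regex` clause in isolation, it can be tried with Python's `re` module. This sketch exercises only the regex; it does not reproduce semgrep's syntax-aware `pattern-inside` scoping, and `has_ip_literal` is a hypothetical helper name.

```python
import re

# Same expression as the pattern-regex clause in the rule above.
IP_LIKE = re.compile(r"\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}")


def has_ip_literal(line: str) -> bool:
    """Return True if the line contains something shaped like an IPv4 address."""
    return IP_LIKE.search(line) is not None


print(has_ip_literal("c = boto3.client(host='8.8.8.8')"))      # True
print(has_ip_literal("c = boto3.client(host='example.com')"))  # False
# The regex does not bound each octet to 0-255, so this also matches:
print(has_ip_literal("host='999.999.999.999'"))                # True
```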

See the semgrep documentation for details.

Example: Semantic Grep Rule

Test your semgrep rules

$ libsast -s tests/assets/rules/semantic_grep/sgrep.yaml tests/assets/files/

Real-world Implementations

  • njsscan: a SAST tool built with the libsast pattern matcher and semantic grep.
  • nodejsscan: a static security code scanner for Node.js applications.
  • MobSF: Static Code Analyzer for Android and iOS mobile applications.
  • mobsfscan: a static security code scanner for mobile applications built for Android (Java, Kotlin) and iOS (Swift, Objective C).