Advanced regex tips I have learned recently

Before we get started 

* matches the previous token between zero and unlimited times, as many times as possible, giving back as needed (greedy).

*? matches the previous token between zero and unlimited times, as few times as possible, expanding as needed (lazy).
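
A quick sketch of the difference, using Python's re module. The sample string and variable names are made up for illustration.

```python
import re

text = "<a><b>"

# Greedy: .* takes as much as it can, so the match runs to the last ">".
greedy = re.search(r"<.*>", text).group(0)

# Lazy: .*? takes as little as it can, so the match stops at the first ">".
lazy = re.search(r"<.*?>", text).group(0)

print(greedy)  # <a><b>
print(lazy)    # <a>
```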

https://www.rexegg.com/regex-conditionals.html

https://stackoverflow.com/a/6308451/8667243

How to match strings between two words

On a single line

word1(.*)word2
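
Here is a small sketch of how this pattern behaves in Python, assuming the default flags. The input strings are invented. Two things are worth noticing: the greedy .* runs to the last word2 on the line, and since . does not match a newline by default, the pattern finds nothing when the two words sit on different lines.

```python
import re

pattern = re.compile(r"word1(.*)word2")

same_line = "word1 alpha word2 beta word2"
across_lines = "word1 alpha\nbeta word2"

# Greedy capture: everything up to the LAST word2 on the line.
captured = pattern.search(same_line).group(1)

# No match: "." stops at the newline, so word2 is never reached.
no_match = pattern.search(across_lines)

print(repr(captured))  # ' alpha word2 beta '
print(no_match)        # None
```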

Across multiple lines

word1((\s|.)*?)word2
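
A minimal sketch of the multi-line pattern in Python, with an invented sample text. As a comparison, and not something from the original notes, it also shows the re.DOTALL flag, which lets a plain . match newlines and avoids the (\s|.) alternation.

```python
import re

text = "word1 alpha\nbeta word2 gamma\nword2"

# (\s|.) matches any character, including newlines; *? stops at the first word2.
alternation = re.search(r"word1((\s|.)*?)word2", text).group(1)

# Alternative (my assumption, not the author's method): re.DOTALL makes "." match "\n".
dotall = re.search(r"word1(.*?)word2", text, flags=re.DOTALL).group(1)

print(repr(alternation))  # ' alpha\nbeta '
print(repr(dotall))       # ' alpha\nbeta '
```

Both forms capture the same text here. The alternation form also creates a second capturing group that holds only the last character matched.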

Examples

Match Python docstrings and comments

(?P<documentation>(?:(?:\s+[\"\']{3}(?:(?:\s|.)*?)[\"\']{3}\n+)?(?:[ \t]*?\#(?:.*?)\n+)*)*)?
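
A rough sketch of trying this pattern in Python. The sample source text and the names DOCUMENTATION and sample are assumptions for illustration. The closing quote class is written as [\"\'] here, because a | inside a character class is a literal pipe, not an alternation. Every part of the pattern is optional, so it can match an empty string; that is why the sketch uses match() at a known position instead of search().

```python
import re

DOCUMENTATION = re.compile(
    r"""(?P<documentation>(?:(?:\s+[\"\']{3}(?:(?:\s|.)*?)[\"\']{3}\n+)?(?:[ \t]*?\#(?:.*?)\n+)*)*)?"""
)

# Hypothetical function body: an indented docstring, a comment, then code.
sample = '    """Doc."""\n    # note\n    x = 1\n'

match = DOCUMENTATION.match(sample)
documentation = match.group("documentation")

print(repr(documentation))  # '    """Doc."""\n    # note\n'
```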

Match Python classes, functions, and comments

# For code blocks
(?P<code_block>(?:[ \t]*)(?P<code_head>(?:(?:(?:@(?:.*)\s+)*)*(?:(?:class)|(?:(?:async\s+)*def)))[ \t]*(?:\w+)\s*\((?:.*?)\)(?:[ \t]*->[ \t]*(?:(.*)*))?:\n+)(?P<code_body>(?:(?:)(?:[ \t]+[^\n]*)|\n)+))
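
A small sketch of reading the named groups from this pattern in Python. The pattern is copied as-is and only split across several string literals for readability; the sample source and the name CODE_BLOCK are assumptions.

```python
import re

CODE_BLOCK = re.compile(
    r"(?P<code_block>(?:[ \t]*)"
    r"(?P<code_head>(?:(?:(?:@(?:.*)\s+)*)*(?:(?:class)|(?:(?:async\s+)*def)))"
    r"[ \t]*(?:\w+)\s*\((?:.*?)\)(?:[ \t]*->[ \t]*(?:(.*)*))?:\n+)"
    r"(?P<code_body>(?:(?:)(?:[ \t]+[^\n]*)|\n)+))"
)

# Hypothetical source: one function followed by unindented code.
source = "def add(a, b):\n    return a + b\n\nx = 1\n"

match = CODE_BLOCK.search(source)
head = match.group("code_head")
body = match.group("code_body")

# The head requires parentheses after the name, so a bare "class Foo:" is skipped.
bare_class = CODE_BLOCK.search("class Foo:\n    pass\n")

print(repr(head))   # 'def add(a, b):\n'
print(repr(body))   # '    return a + b\n\n'
print(bare_class)   # None
```

The body group keeps consuming indented lines and blank lines, and stops at the first line that starts in column zero.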

or

# For all head information
(?P<class_or_function_top_definition>(?: *@(?:.*?)\n+)* *(?:\s+(?P<is_class>class)|(?P<is_function>def|async +def)) +(?:(?:\n|.)*?):\n+)(?P<documentation>(?:(?:\s+[\"\']{3}(?:(?:\s|.)*?)[\"\']{3}\n+)?(?:[ \t]*?\#(?:.*?)\n+)*)*)?(?P<class_or_function_properties>(?(is_class)((?![ \t]+(?:def|class) )(?:(?:.*?): *(?:.*?) *= *(?:.*?)\n)*)|(?:)))?
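
The pattern above relies on a conditional, (?(is_class)yes|no), which picks a branch depending on whether the is_class group took part in the match. Here is a much smaller sketch of the same idea in Python. It is a simplified, hypothetical header pattern, not a replacement for the one above; the group names name and bases and the name HEADER are my own.

```python
import re

# If "class" matched, the parenthesised part is optional; otherwise (def) it is required.
HEADER = re.compile(
    r"(?:(?P<is_class>class)|def) +(?P<name>\w+)"
    r"(?(is_class)(?P<bases>\(.*?\))?|\(.*?\)):"
)

class_plain = HEADER.fullmatch("class Foo:")
class_based = HEADER.fullmatch("class Foo(Base):")
func_ok = HEADER.fullmatch("def f():")
func_bad = HEADER.fullmatch("def f:")

print(class_plain.group("name"))   # Foo
print(class_based.group("bases"))  # (Base)
print(func_ok is not None)         # True
print(func_bad)                    # None
```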

Author

yingshaoxo@gmail.com