{"id":34,"date":"2024-02-12T18:22:44","date_gmt":"2024-02-12T18:22:44","guid":{"rendered":"https:\/\/www.bowenv.com\/?p=34"},"modified":"2024-02-13T02:10:00","modified_gmt":"2024-02-13T02:10:00","slug":"regular-expression","status":"publish","type":"post","link":"https:\/\/www.bowenv.com\/index.php\/2024\/02\/12\/regular-expression\/","title":{"rendered":"Regular Expression"},"content":{"rendered":"<p><strong><em>Re for Texts Surrounded by {} with Outermost {}<\/em><\/strong><\/p>\n<pre><code class=\"language-python\">r&#039;\\{(?:[^{}]|(?R))*\\}&#039;<\/code><\/pre>\n<p>The expression <code>r&#039;\\{(?:[^{}]|(?R))*\\}&#039;<\/code> is a regular expression written as a Python raw string (<code>r&#039;...&#039;<\/code>). Note that the recursive construct <code>(?R)<\/code> is not supported by Python's built-in <code>re<\/code> module; it requires an engine with recursion support, such as the third-party <code>regex<\/code> module. Let's break down the components of this regular expression:<\/p>\n<ol>\n<li>\n<p><code>r&#039;<\/code>: The raw string prefix in Python, which tells Python to treat backslashes <code>\\<\/code> as literal characters rather than as string escape characters, so they reach the regex engine unchanged.<\/p>\n<\/li>\n<li>\n<p><code>\\{<\/code>: This matches the literal opening curly brace <code>{<\/code>. The backslash escapes the curly brace because <code>{<\/code> can have a special meaning in regular expressions (it opens a repetition quantifier such as <code>{2,3}<\/code>).<\/p>\n<\/li>\n<li>\n<p><code>(?: ... )<\/code>: This is a non-capturing group. It groups the enclosed alternatives so that the quantifier can apply to them as a unit, without capturing the matched text.<\/p>\n<\/li>\n<li>\n<p><code>[^{}]<\/code>: This is a character class that matches any single character that is not a curly brace <code>{<\/code> or <code>}<\/code>. The <code>^<\/code> at the beginning of the character class negates it, meaning it matches any character except those specified.<\/p>\n<\/li>\n<li>\n<p><code>|<\/code>: This is the alternation operator, acting like a logical OR. 
It allows the regex to match either the pattern on the left or the pattern on the right.<\/p>\n<\/li>\n<li>\n<p><code>(?R)<\/code>: This is a recursive reference to the entire regular expression. Whenever a nested <code>{<\/code> is encountered, the whole pattern is applied again at that point, so balanced braces are matched to any depth.<\/p>\n<\/li>\n<li>\n<p><code>*<\/code>: This is a quantifier that matches zero or more occurrences of the preceding group.<\/p>\n<\/li>\n<li>\n<p><code>\\}<\/code>: This matches the literal closing curly brace <code>}<\/code>.<\/p>\n<\/li>\n<\/ol>\n<p>Putting it all together, the entire regular expression <code>r&#039;\\{(?:[^{}]|(?R))*\\}&#039;<\/code> can be interpreted as follows:<\/p>\n<ul>\n<li><code>\\{<\/code>: Match the opening curly brace.<\/li>\n<li><code>(?:[^{}]|(?R))*<\/code>: Match zero or more items, each of which is either a single character that is not a curly brace or a nested brace group matched by the entire pattern recursively.<\/li>\n<li><code>\\}<\/code>: Match the closing curly brace.<\/li>\n<\/ul>\n<p>In simpler terms, this regular expression matches a string enclosed in curly braces, from the outermost <code>{<\/code> to its matching <code>}<\/code>, allowing balanced nested curly braces inside. It is useful for pulling brace-delimited blocks out of text, such as nested expressions in programming languages or JSON-like structures, although it does not treat braces inside quoted strings specially.<\/p>\n<p><strong><em>Re for Texts Surrounded by {} with at Most One Level of Nested {}<\/em><\/strong><\/p>\n<p><!-- ### Re for Texts Surrounded by \\{\\} with Balanced \\{\\} --><\/p>\n<pre><code class=\"language-python\">re.compile(r&#039;\\\\emph\\{([^{}]*(?:\\{[^{}]*\\}[^{}]*)*)\\}&#039;)<\/code><\/pre>\n<ul>\n<li>\n<p><code>re.compile<\/code>: This is a function in the <code>re<\/code> module that compiles a regular expression pattern into a reusable pattern object.<\/p>\n<\/li>\n<li>\n<p><code>r&#039;...&#039;<\/code>: The <code>r<\/code> prefix before the string denotes a raw string in Python. It ensures that backslashes are treated as literal characters and not as escape characters.<\/p>\n<\/li>\n<li>\n<p><code>\\\\emph\\{<\/code>: This part matches the literal string &quot;\\emph{&quot; in the text. 
The backslash is doubled because a single backslash is the escape character in regex, so <code>\\\\<\/code> is what matches one literal backslash.<\/p>\n<\/li>\n<li>\n<p><code>([^{}]*(?:\\{[^{}]*\\}[^{}]*)*)<\/code>: This is the main capturing group that captures the content inside the <code>\\emph{}<\/code> command.<\/p>\n<ul>\n<li>\n<p><code>([^{}]*<\/code>: This part opens the capturing group and matches any run, possibly empty, of characters that are not curly braces.<\/p>\n<\/li>\n<li>\n<p><code>(?:\\{[^{}]*\\}[^{}]*)*<\/code>: This is a non-capturing group <code>(?: ... )<\/code> followed by the repetition quantifier <code>*<\/code>. Each repetition matches the pattern <code>\\{[^{}]*\\}[^{}]*<\/code>, that is, a pair of curly braces containing any characters except curly braces, followed by any run of characters that are not curly braces.<\/p>\n<\/li>\n<li>\n<p>Because the non-capturing group is repeated with <code>*<\/code>, the content may contain any number of brace pairs, but only one level deep. If the braces are nested more deeply than that, the pattern does not match.<\/p>\n<\/li>\n<\/ul>\n<\/li>\n<li>\n<p><code>\\}<\/code>: This part matches the closing curly brace <code>}<\/code>.<\/p>\n<\/li>\n<\/ul>\n<p>So, in summary, this regular expression is designed to match and capture the content within <code>\\emph{...}<\/code> commands, handling one level of nested curly braces within the emphasized text.<\/p>\n<p><strong><em>Non-Capturing Group<\/em><\/strong><\/p>\n<pre><code class=\"language-python\">re.compile(r&#039;\\\\emph\\{([^{}]*(?:\\{[^{}]*\\}[^{}]*)*)\\}&#039;)<\/code><\/pre>\n<ul>\n<li>\n<p><code>(?: ... )<\/code>: This is the syntax for a non-capturing group in a regular expression. It groups the enclosed pattern without creating a capture group for the matched result.<\/p>\n<\/li>\n<li>\n<p><code>\\{<\/code>: Matches the opening curly brace <code>{<\/code> literally.<\/p>\n<\/li>\n<li>\n<p><code>[^{}]*<\/code>: Matches any sequence of characters that are not curly braces. 
This ensures that the content inside the curly braces does not contain additional nested curly braces.<\/p>\n<\/li>\n<li>\n<p><code>\\}<\/code>: Matches the closing curly brace <code>}<\/code> literally.<\/p>\n<\/li>\n<li>\n<p><code>[^{}]*<\/code>: Matches any sequence of characters that are not curly braces. This allows for matching the text following the closing curly brace.<\/p>\n<\/li>\n<li>\n<p><code>*<\/code>: This quantifier applies to the entire non-capturing group <code>(?:\\{[^{}]*\\}[^{}]*)<\/code>, allowing for zero or more occurrences of the pattern it encapsulates. This lets the emphasized text contain any number of brace pairs, each one level deep.<\/p>\n<\/li>\n<\/ul>\n<p>In summary, the non-capturing group is used to define a pattern for matching a pair of curly braces, the content within them, and the text that follows, without creating a separate capture group for this specific part of the regex.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>Re for Texts Surrounded by {} with Outermost {} r&#039;\\{(?:[^{}]|(?R))*\\}&#039; The expression r&#039;\\{(?:[^{}]|(?R))*\\}&#039; is a regular expression written in Python using the raw string notation (r&#039;&#8230;&#039;). 
Let&#8217;s break down the components of this regular expression: r&#039;: The raw string notation in Python, indicating that backslashes \\ are treated as literal characters and not as escape [&hellip;]<\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"closed","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[1],"tags":[],"class_list":["post-34","post","type-post","status-publish","format-standard","hentry","category-uncategorized"],"_links":{"self":[{"href":"https:\/\/www.bowenv.com\/index.php\/wp-json\/wp\/v2\/posts\/34","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.bowenv.com\/index.php\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.bowenv.com\/index.php\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.bowenv.com\/index.php\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/www.bowenv.com\/index.php\/wp-json\/wp\/v2\/comments?post=34"}],"version-history":[{"count":8,"href":"https:\/\/www.bowenv.com\/index.php\/wp-json\/wp\/v2\/posts\/34\/revisions"}],"predecessor-version":[{"id":43,"href":"https:\/\/www.bowenv.com\/index.php\/wp-json\/wp\/v2\/posts\/34\/revisions\/43"}],"wp:attachment":[{"href":"https:\/\/www.bowenv.com\/index.php\/wp-json\/wp\/v2\/media?parent=34"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.bowenv.com\/index.php\/wp-json\/wp\/v2\/categories?post=34"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.bowenv.com\/index.php\/wp-json\/wp\/v2\/tags?post=34"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}