Saturday, 14 September 2013

scala regexp stackoverflow

Typing the following in Scala (pattern matching with a regexp to find the
value of the id field):
val str = """<path sodipodi:nodetypes="csszsscsscssssscssssscc"
inkscape:connector-curvature="0" id="basarbre" d="M 111.11111,111.11111 C
101.11111,111.1001 111.11111,111.11111 111.1011,101.01111
111.11111,111.1111 111.11111,110.11111 111.10111,111.11101
110.01111,111.11111 110.11111,111.11101 111.11111,111.01111
110.11111,111.1111 101.11111,111.10111 111.11111,111.11111
111.11111,101.11111 111.11111,111.11111 111.11111,111.11111
111.11111,111.11101 111.11111,101.11111 111.11111,101.11111
111.11111,101.11111 111.111,111.11101 101.01111,110.11111
111.11111,111.11111 101.1111,111.11111 101.11101,110.11111
111.10111,110.11101 101.11111,111.11111 101.11111,111.11111
101.11111,111.11111 111.11111,110.1111 111.10111,111.11111
111.11011,111.11111 111.11101,111.11111 111.01111,111.11111
110.11111,111.11111 111.11111,111.11111 110.01111,111.11111
111.11111,111.11111 111.11111,111.11111 111.01111,101.11111
111.11111,111.11101 110.11011,110.11111 101.11111,111.01111
11.111111,111.11111 11.111111,111.11111 11.111111,111.11111
11.111111,111.11111 11.111111,111.1111 10.111111,111.11111
11.111111,101.11111 11.010111,100.11111 11.111111,110.11111
11.111111,110.11111 11.111111,111.11111 11.111111,111.11111
11.010111,111.1111 11.101111,111.01111 11.11011,101.11111
-11.111111,110.11111 11.011111,111.11111 11.111111,111.10101
11.11111,111.11111 111.11101,111.01011 111.11101,111.01011 z"
style="fill:#511b00;fill-opacity:1;stroke:none"
xmlns:inkscape="http://www.inkscape.org/namespaces/inkscape"
xmlns:sodipodi="http://sodipodi.sourceforge.net/DTD/sodipodi-0.dtd"
xmlns:xlink="http://www.w3.org/1999/xlink"
xmlns="http://www.w3.org/2000/svg" xmlns:svg="http://www.w3.org/2000/svg"
xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
xmlns:cc="http://creativecommons.org/ns#"
xmlns:dc="http://purl.org/dc/elements/1.1/"
xmlns:osb="http://www.openswatchbook.org/uri/2009/osb"/>"""
val Idpattern = """.*id="([^"]*)"(?:[\n\r\t]|.)*""".r
str match {
case Idpattern(id) => id
case _ => "no id"
}
yields a stack overflow with the following exception trace:
at java.util.regex.Pattern$GroupTail.match(Pattern.java:4615)
at java.util.regex.Pattern$BranchConn.match(Pattern.java:4466)
at java.util.regex.Pattern$CharProperty.match(Pattern.java:3694)
at java.util.regex.Pattern$Branch.match(Pattern.java:4502)
at java.util.regex.Pattern$GroupHead.match(Pattern.java:4556)
at java.util.regex.Pattern$Loop.match(Pattern.java:4683)
at java.util.regex.Pattern$GroupTail.match(Pattern.java:4615)
at java.util.regex.Pattern$BranchConn.match(Pattern.java:4466)
at java.util.regex.Pattern$CharProperty.match(Pattern.java:3694)
...
How can I overcome this problem? I could parse the XML with a library,
but that seems more complicated than I need. I thought a regexp would be
fast and reliable.
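
One possible way around the overflow, offered as a sketch rather than a definitive answer: the trace suggests that java.util.regex is recursing once per character while matching the repeated group (?:[\n\r\t]|.)*, so a long d attribute exhausts the stack. Searching for the attribute alone, instead of matching the whole string, avoids that group entirely. The names IdExtract, IdAttr and extractId are made up for this illustration, and the leading \s assumes the id attribute is preceded by whitespace, as it is in the input above.

```scala
import scala.util.matching.Regex

object IdExtract {
  // Matches only the attribute itself. The leading \s (an assumption about
  // the input) keeps the pattern from matching the tail of a longer
  // attribute name that merely ends in "id".
  val IdAttr: Regex = """\sid="([^"]*)"""".r

  // Searches for the first occurrence anywhere in the input instead of
  // requiring the pattern to cover the whole string.
  def extractId(xml: String): String =
    IdAttr.findFirstMatchIn(xml).map(_.group(1)).getOrElse("no id")
}
```

With the string from the question, IdExtract.extractId(str) should return the id value without the pattern ever having to consume the long path data. If whole-string matching with case Idpattern(id) is preferred, prefixing the pattern with (?s) makes . match line terminators as well, so the alternation can be replaced by a plain .* on both sides.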
