On Sep 19, 2008, at 2:05 PM, Sergiu Dumitriu wrote:
Vincent Massol wrote:
Hi,
I'd like to propose changing the escape character. Right now it's \.
The problem is that if you want to enter a \, you'll want to write \\,
but that's reserved for a new line...
Hence, and in order to be more Creole-compatible, I propose replacing
our current \ escape character with ~.
See
http://www.wikicreole.org/wiki/Creole1.0#section-Creole1.0-EscapeCharacter
Here's my +1
Will it be easy to enter a plain ~? In URLs or inside normal text,
I'd like to be able to just write a ~, and the renderer should know
that it is not supposed to escape anything.
This would be the definition (as described in the Creole link I
mentioned):
<#ESCAPE: ( "~" ~[" ", "\t", "\n", "\r"] ) >
The reason for choosing ~ is that it's not a character you use often
(certainly less often than \ IMO).
From the creole link:
"
Rationale: If one needs keyboard characters often in a text, there
would be too many distracting triple curly braces to be able to work
with the text well. Therefore an escape character would help to keep
people from being so distracted by the nowiki inline and escape
character could be used instead. The tilde was chosen, so it would not
conflict with the backslashes in line breaks and because it is a
relatively infrequently used character. It is not generally easy to
type, but it will also not need to be used often, so in this sense it
is also suitable. This way, stars, slashes and other markup
characters, when found in the original text, can be easily escaped to
be rendered as themselves.
"
So if you need to enter ~ in some text you'll need to enter ~~
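To make the ESCAPE definition concrete, here is a minimal sketch of how it behaves, written as a Python regular expression rather than JavaCC. The helper name unescape is hypothetical and is not part of WikiModel or Doxia; it only strips the escape character, whereas a real renderer would also have to keep the escaped character from being parsed as markup.

```python
import re

# One escape is "~" followed by any single non-whitespace character,
# mirroring <#ESCAPE: ( "~" ~[" ", "\t", "\n", "\r"] ) >.
ESCAPE = re.compile(r"~([^ \t\n\r])")


def unescape(text: str) -> str:
    """Replace each escape sequence with the character it protects."""
    return ESCAPE.sub(lambda m: m.group(1), text)


print(unescape("~~"))            # a doubled tilde yields one literal tilde
print(unescape("~*not bold~*"))  # escaped stars come out as plain stars
print(unescape("a ~ b"))         # a tilde before whitespace is left alone
```

Under this reading, a ~ followed by whitespace (or at the very end of the text) is not an escape at all, so ~~ is only needed when the tilde is directly followed by another character.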
Do WikiModel and Doxia support this easily?
For URLs, here's what WikiModel recognizes right now:
// ========================================================================
// URI syntax recognition.
// ========================================================================
// This grammar recognizes the full URI syntax with the following exceptions:
//  * It has a simplified hier-part definition: it does not contain an empty
//    path (so sequences like "here: " are not recognized as URIs).
//  * It has a simplified version of the host definition: it does not contain
//    explicit IP definitions.
//  * It parses "extended" URI syntax where "opaque" URIs are treated as
//    having multiple scheme parts.
//    Example: in an opaque URI like "download:http://www.foo.com/bar.zip"
//    the part "download:http" is treated as a "composite" scheme part.
//
// See also:
//  * http://tools.ietf.org/html/rfc3986#page-49 - the official URI grammar
//  * http://en.wikipedia.org/wiki/Uniform_Resource_Identifier
//  * http://en.wikipedia.org/wiki/URI_scheme#Generic_syntax
//  * http://www.iana.org/assignments/uri-schemes.html
// ========================================================================
| <#URI: <URI_SCHEME_COMPOSITE> ":" <URI_HIER_PART> ("?" <URI_QUERY>)? ("#" <URI_FRAGMENT>)? >
| <#ALPHA: ( ["A"-"Z", "a"-"z"] )>
| <#DIGIT: ["0"-"9"]>
| <#HEXDIG: ( <DIGIT> | ["A"-"F"] | ["a"-"f"] ) >
| <#URI_GEN_DELIMS: [ ":", "/", "?", "#", "[", "]", "@" ]>
// Some default can not be accepted in the text - like "," symbols
//<#URI_SUB_DELIMS: [ "!", "$", "&", "'", "(", ")", "*", "+", ",", ";", "=" ]>
| <#URI_SUB_DELIMS: [ "!", "$", "&", "'", "(", ")", "*", "+", /*",",*/ ";", "=" ]>
| <#URI_UNRESERVED: ( <ALPHA> | <DIGIT> | "-" | "." | "_" | "~" )>
| <#URI_RESERVED: ( <URI_GEN_DELIMS> | <URI_SUB_DELIMS> ) >
| <#URI_SCHEME: <ALPHA> ( <ALPHA> | <DIGIT> | "+" | "-" | "." )* >
| <#URI_SCHEME_COMPOSITE: <URI_SCHEME> ( ":" <URI_SCHEME> )* >
| <#URI_PCT_ENCODED: "%" <HEXDIG> <HEXDIG> >
| <#URI_PCHAR_FIRST: ( <URI_UNRESERVED> | <URI_PCT_ENCODED> | <URI_SUB_DELIMS> ) >
| <#URI_PCHAR: ( <URI_PCHAR_FIRST> | ":" | "@" ) >
| <#URI_QUERY: ( <URI_PCHAR> | "/" | "?" )* >
| <#URI_FRAGMENT: ( <URI_PCHAR> | "/" | "?" )* >
// A simplified hier-part definition: it does not contain an empty path.
| <#URI_HIER_PART: ( "//" <URI_AUTHORITY> <URI_PATH_ABEMPTY> | <URI_PATH_ABSOLUTE> | <URI_PATH_ROOTLESS> )>
| <#URI_AUTHORITY: ( <URI_USERINFO> "@" )? <URI_HOST> ( ":" <URI_PORT> )? >
| <#URI_USERINFO: ( <URI_UNRESERVED> | <URI_PCT_ENCODED> | <URI_SUB_DELIMS> | ":" )* >
| <#URI_PATH_ABEMPTY: ( "/" <URI_SEGMENT> )* >
| <#URI_PATH_ABSOLUTE: "/" ( <URI_SEGMENT_NZ> ( "/" <URI_SEGMENT> )* )? >
| <#URI_PATH_ROOTLESS: <URI_PCHAR_FIRST> <URI_SEGMENT_NZ_NC> ( "/" <URI_SEGMENT> )* >
| <#URI_SEGMENT: (<URI_PCHAR>)* >
| <#URI_SEGMENT_NZ: (<URI_PCHAR>)+ >
| <#URI_SEGMENT_NZ_NC: (<URI_UNRESERVED> | <URI_PCT_ENCODED> | <URI_SUB_DELIMS> | "@")+ >
| <#URI_PORT: (<DIGIT>)+ >
// A simplified version of the host: it does not contain explicit IP definitions
| <#URI_HOST: ( <URI_REG_NAME> ) >
| <#URI_REG_NAME: ( <URI_UNRESERVED> | <URI_PCT_ENCODED> | <URI_SUB_DELIMS> )* >
// ========================================================================
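Since "~" is listed in URI_UNRESERVED, a URL containing a tilde already falls inside what this grammar accepts as a single URI. As a rough illustration, here is the same set of rules transcribed into a Python regular expression. This is only a sketch: the variable names and the is_uri helper are made up for the example, and it does not model JavaCC token ordering or lexical states, so it says nothing definitive about whether the escape token or the URI token wins in the real lexer.

```python
import re

# Character classes, transcribed from the private tokens in the grammar.
UNRESERVED = r"[A-Za-z0-9\-._~]"      # note the "~" here
PCT = r"%[0-9A-Fa-f]{2}"
SUB_DELIMS = r"[!$&'()*+;=]"          # "," is left out, as in the grammar
PCHAR_FIRST = rf"(?:{UNRESERVED}|{PCT}|{SUB_DELIMS})"
PCHAR = rf"(?:{PCHAR_FIRST}|[:@])"

# Scheme, including the "composite" form such as download:http.
SCHEME = r"[A-Za-z][A-Za-z0-9+\-.]*"
SCHEME_COMPOSITE = rf"{SCHEME}(?::{SCHEME})*"

# Path segments and authority.
SEGMENT = rf"{PCHAR}*"
SEGMENT_NZ = rf"{PCHAR}+"
SEGMENT_NZ_NC = rf"(?:{UNRESERVED}|{PCT}|{SUB_DELIMS}|@)+"
USERINFO = rf"(?:{UNRESERVED}|{PCT}|{SUB_DELIMS}|:)*"
REG_NAME = rf"(?:{UNRESERVED}|{PCT}|{SUB_DELIMS})*"
AUTHORITY = rf"(?:{USERINFO}@)?{REG_NAME}(?::[0-9]+)?"

PATH_ABEMPTY = rf"(?:/{SEGMENT})*"
PATH_ABSOLUTE = rf"/(?:{SEGMENT_NZ}(?:/{SEGMENT})*)?"
PATH_ROOTLESS = rf"{PCHAR_FIRST}{SEGMENT_NZ_NC}(?:/{SEGMENT})*"

# Simplified hier-part: no empty path alternative.
HIER_PART = rf"(?://{AUTHORITY}{PATH_ABEMPTY}|{PATH_ABSOLUTE}|{PATH_ROOTLESS})"
QUERY_OR_FRAGMENT = rf"(?:{PCHAR}|[/?])*"

URI = re.compile(
    rf"{SCHEME_COMPOSITE}:{HIER_PART}"
    rf"(?:\?{QUERY_OR_FRAGMENT})?(?:#{QUERY_OR_FRAGMENT})?"
)


def is_uri(text: str) -> bool:
    """True if the whole string is accepted by the transcribed URI rules."""
    return URI.fullmatch(text) is not None


print(is_uri("http://www.example.com/~user/page"))    # tilde inside a path
print(is_uri("download:http://www.foo.com/bar.zip"))  # composite scheme
print(is_uri("here:"))                                # empty hier-part
```

The example.com URL is a made-up test input. The point is only that a tilde inside a URL is ordinary URI content for these rules, so in principle it would not need to be doubled as long as the URI is recognized before escapes are processed.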
In any case, since we've decided to be as compatible as possible with
Creole, I think it makes sense.
Thanks
-Vincent