git.wincent.com - wikitext.git/log
wikitext.git
14 years agoRemove unnecessary string_from_str call in _Wikitext_parser_encode_link_target
Wincent Colaiuta [Mon, 11 May 2009 20:04:07 +0000 (22:04 +0200)] 
Remove unnecessary string_from_str call in _Wikitext_parser_encode_link_target

Signed-off-by: Wincent Colaiuta <win@wincent.com>
14 years agoMake link_target parameter a pointer to str in append hyperlink function
Wincent Colaiuta [Mon, 11 May 2009 19:59:36 +0000 (21:59 +0200)] 
Make link_target parameter a pointer to str in append hyperlink function

This change allows us to get rid of unnecessary Ruby String instance
instantiations in about 8 different places.

Signed-off-by: Wincent Colaiuta <win@wincent.com>
14 years agoRemove unused str_new_size function
Wincent Colaiuta [Mon, 11 May 2009 19:26:44 +0000 (21:26 +0200)] 
Remove unused str_new_size function

Signed-off-by: Wincent Colaiuta <win@wincent.com>
14 years agoRemove unused str_swap function
Wincent Colaiuta [Mon, 11 May 2009 19:25:17 +0000 (21:25 +0200)] 
Remove unused str_swap function

Signed-off-by: Wincent Colaiuta <win@wincent.com>
14 years agoReformat _Wikitext_utf8_to_utf32 for better readability
Wincent Colaiuta [Mon, 11 May 2009 19:11:12 +0000 (21:11 +0200)] 
Reformat _Wikitext_utf8_to_utf32 for better readability

Reduce line lengths to make the _Wikitext_utf8_to_utf32
function more readable, most notably by splitting lengthy
condition expressions and bitwise-OR expressions across
multiple lines.

Signed-off-by: Wincent Colaiuta <win@wincent.com>
14 years agoAvoid copying string backing when returning from parse function
Wincent Colaiuta [Mon, 11 May 2009 19:00:20 +0000 (21:00 +0200)] 
Avoid copying string backing when returning from parse function

This is a somewhat nasty hack to avoid making a copy of the output
when it comes time to return from the function. For the time being
it will only work with Ruby 1.8.x, or at least, 1.9.x hasn't been
tested yet.

Signed-off-by: Wincent Colaiuta <win@wincent.com>
14 years agoMake _Wikitext_parser_sanitize_link_target return void
Wincent Colaiuta [Mon, 11 May 2009 18:56:42 +0000 (20:56 +0200)] 
Make _Wikitext_parser_sanitize_link_target return void

Avoid the creation of another temporary Ruby String instance
by appending directly to a buffer. As part of this change the
_Wikitext_parser_sanitize_link_target function has been renamed
to _Wikitext_parser_append_sanitized_link_target.

Signed-off-by: Wincent Colaiuta <win@wincent.com>
14 years agoRefactor _Wikitext_utf32_char_to_entity (append to buffer)
Wincent Colaiuta [Mon, 11 May 2009 17:23:02 +0000 (19:23 +0200)] 
Refactor _Wikitext_utf32_char_to_entity (append to buffer)

Rename the _Wikitext_utf32_char_to_entity function to
_Wikitext_append_entity_from_utf32_char, teaching it to
append to a target buffer directly rather than creating
a temporary Ruby String instance.

I don't particularly like these low-level manipulations but
the main goal here is to avoid the extra allocation; a
subsequent commit will clean up.

Signed-off-by: Wincent Colaiuta <win@wincent.com>
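As a rough illustration of the "append to a target buffer directly" idea, here is a minimal sketch. It assumes a hypothetical fixed-capacity buf_t in place of the real str_t, and it assumes a lowercase hexadecimal entity format; neither is taken from the wikitext source.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

// Hypothetical fixed-capacity output buffer; a stand-in for the real str_t.
typedef struct {
    char   data[64];
    size_t len;
} buf_t;

// Append a numeric entity for code point cp to buf without creating any
// temporary string. The "&#x%04x;" format is an assumption for illustration.
// Returns the number of bytes appended, or 0 if the entity would not fit.
size_t append_entity_from_utf32_char(buf_t *buf, uint32_t cp)
{
    assert(buf != NULL && buf->len < sizeof(buf->data));
    size_t room = sizeof(buf->data) - buf->len;
    int n = snprintf(buf->data + buf->len, room, "&#x%04x;", (unsigned)cp);
    if (n < 0 || (size_t)n >= room) {
        buf->data[buf->len] = '\0'; // discard any partial write
        return 0;
    }
    buf->len += (size_t)n;
    return (size_t)n;
}
```

The caller owns the buffer, so the function itself allocates nothing; that is the property the commit is after.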
14 years agoCollapse conditional inside _Wikitext_parser_sanitize_link_target
Wincent Colaiuta [Mon, 11 May 2009 16:33:35 +0000 (18:33 +0200)] 
Collapse conditional inside _Wikitext_parser_sanitize_link_target

Combine two separate "else if" conditions into a single one.

Signed-off-by: Wincent Colaiuta <win@wincent.com>
14 years agoRemove unnecessary allocation from _Wikitext_parser_sanitize_link_target
Wincent Colaiuta [Mon, 11 May 2009 16:25:50 +0000 (18:25 +0200)] 
Remove unnecessary allocation from _Wikitext_parser_sanitize_link_target

This temporary String object isn't required.

Signed-off-by: Wincent Colaiuta <win@wincent.com>
14 years agoAdd sanity checks to parsing benchmark scripts
Wincent Colaiuta [Sun, 10 May 2009 23:44:11 +0000 (01:44 +0200)] 
Add sanity checks to parsing benchmark scripts

After the grand refactoring there are evidently still some lingering
low-level errors, because the benchmarking scripts are bailing with
an "overlong encoding" error after a certain period of time (full
output below).

I've added some sanity checks to the scripts to try and catch discrepancies
but so far none have been discovered.

Here is the full output of the run (this one for "parsing.rb", but the
results are similar for "profile_parsing.rb"):

Rehearsal -------------------------------------------------------------
short slab of ASCII text    1.800000   0.020000   1.820000 (  2.182344)
short slab of UTF-8 text    3.540000   0.030000   3.570000 (  4.127638)
longer slab of ASCII text  14.600000   0.140000  14.740000 ( 17.301072)
longer slab of UTF-8 text  46.150000   0.490000  46.640000 ( 58.118039)
--------------------------------------------------- total: 66.770000sec

                                user     system      total        real
short slab of ASCII text    1.800000   0.020000   1.820000 (  2.087143)
short slab of UTF-8 text    3.580000   0.040000   3.620000 (  4.315676)
longer slab of ASCII text  14.680000   0.160000  14.840000 ( 18.018380)
longer slab of UTF-8 text benchmarks/parsing.rb:321:in `parse': invalid
  encoding: overlong encoding (Wikitext::Parser::Error)
  from benchmarks/parsing.rb:321:in `parse'
  from benchmarks/parsing.rb:320:in `times'
  from benchmarks/parsing.rb:320:in `parse'
  from /System/Library/Frameworks/Ruby.framework/Versions/1.8/usr/lib/...
  from /System/Library/Frameworks/Ruby.framework/Versions/1.8/usr/lib/...
  from /System/Library/Frameworks/Ruby.framework/Versions/1.8/usr/lib/...
  from /System/Library/Frameworks/Ruby.framework/Versions/1.8/usr/lib/...
  from benchmarks/parsing.rb:331

Signed-off-by: Wincent Colaiuta <win@wincent.com>
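For readers unfamiliar with the error in the trace above: a UTF-8 sequence is "overlong" when it spends more bytes than the code point needs. The following is a minimal sketch of a decoder that rejects such input. The function name and table are hypothetical, and this is not the parser's own _Wikitext_utf8_to_utf32.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

// Smallest code point that legitimately needs a sequence of each length
// (index 0 is unused).
static const uint32_t min_code_point[5] = { 0, 0x0, 0x80, 0x800, 0x10000 };

// Decode one UTF-8 sequence from the n bytes at s. On success, store the
// code point in *out and return the sequence length. Return 0 for truncated,
// malformed or overlong input. Surrogates and values above U+10FFFF are not
// rejected in this sketch.
size_t utf8_decode_strict(const unsigned char *s, size_t n, uint32_t *out)
{
    assert(s != NULL && out != NULL);
    if (n == 0) return 0;
    size_t len;
    uint32_t cp;
    if (s[0] < 0x80)                { len = 1; cp = s[0]; }
    else if ((s[0] & 0xe0) == 0xc0) { len = 2; cp = s[0] & 0x1f; }
    else if ((s[0] & 0xf0) == 0xe0) { len = 3; cp = s[0] & 0x0f; }
    else if ((s[0] & 0xf8) == 0xf0) { len = 4; cp = s[0] & 0x07; }
    else return 0; // stray continuation byte or invalid lead byte
    if (len > n) return 0; // truncated sequence
    for (size_t i = 1; i < len; i++) {
        if ((s[i] & 0xc0) != 0x80) return 0; // not a continuation byte
        cp = (cp << 6) | (s[i] & 0x3f);
    }
    if (cp < min_code_point[len]) return 0; // overlong encoding
    *out = cp;
    return len;
}
```

If valid input triggers this kind of check only after many iterations, as in the benchmark run, the likelier culprit is the bytes being corrupted before they reach the decoder rather than the check itself.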
14 years agoRemove str_new_no_copy function
Wincent Colaiuta [Sun, 10 May 2009 16:42:18 +0000 (18:42 +0200)] 
Remove str_new_no_copy function

This function has no callers, so remove it.

Signed-off-by: Wincent Colaiuta <win@wincent.com>
14 years agoOverallocate for speed in str.c
Wincent Colaiuta [Sun, 10 May 2009 16:19:12 +0000 (18:19 +0200)] 
Overallocate for speed in str.c

One of the key motivations for switching to the str_t type internally
is that we can avoid allocations by re-using the same storage over and
over during the transformation.

We can avoid other allocations by overallocating when more storage
is requested, seeing as almost all requests for more storage will
later be followed by other requests.

At the moment, the original implementation is quite fast:

                                user     system      total        real
short slab of ASCII text    1.440000   0.000000   1.440000 (  1.445547)
short slab of UTF-8 text    2.900000   0.010000   2.910000 (  2.927274)
longer slab of ASCII text  12.710000   0.040000  12.750000 ( 12.816209)
longer slab of UTF-8 text  35.210000   0.080000  35.290000 ( 35.661577)

The new implementation is actually slower, because we have had to add
some wasteful conversions back-and-forth between VALUE/String and str_t:

short slab of ASCII text    1.550000   0.000000   1.550000 (  1.556956)
short slab of UTF-8 text    3.340000   0.010000   3.350000 (  3.377874)
longer slab of ASCII text  15.410000   0.030000  15.440000 ( 15.484308)
longer slab of UTF-8 text  45.230000   0.130000  45.360000 ( 45.631355)

It is expected that the performance loss will be recovered once these
wasteful conversions are eliminated.

But before going that far, adding overallocation brings very large
improvements, enough to compensate for the inefficient conversions:

short slab of ASCII text    1.190000   0.010000   1.200000 (  1.233443)
short slab of UTF-8 text    2.460000   0.020000   2.480000 (  2.536714)
longer slab of ASCII text  11.000000   0.050000  11.050000 ( 11.208843)
longer slab of UTF-8 text  33.490000   0.130000  33.620000 ( 34.190941)

Those numbers use an overallocation constant of 256 bytes; will later
experiment with other constants to find the optimal overallocation.

Signed-off-by: Wincent Colaiuta <win@wincent.com>
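The overallocation strategy described above can be sketched roughly as follows. The buf_t type and function names are hypothetical stand-ins for str_t and the functions in str.c; only the 256-byte constant comes from the commit message.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define OVERALLOC 256 // padding added to every request for more storage

typedef struct {
    char   *ptr;
    size_t  len;      // bytes in use
    size_t  capacity; // bytes allocated
} buf_t;

// Ensure room for at least len bytes. When the buffer must grow, ask for
// OVERALLOC bytes more than requested so that the next few appends need no
// further allocation. Returns 0 on success, -1 if allocation fails (the
// buffer is left untouched in that case).
int buf_grow(buf_t *buf, size_t len)
{
    assert(buf != NULL);
    if (len <= buf->capacity) return 0;
    size_t new_capacity = len + OVERALLOC;
    char *p = realloc(buf->ptr, new_capacity);
    if (p == NULL) return -1;
    buf->ptr = p;
    buf->capacity = new_capacity;
    return 0;
}

// Append len bytes from src, growing the buffer if necessary.
int buf_append(buf_t *buf, const char *src, size_t len)
{
    if (len == 0) return 0;
    if (buf_grow(buf, buf->len + len) != 0) return -1;
    memcpy(buf->ptr + buf->len, src, len);
    buf->len += len;
    return 0;
}
```

With a zero-initialized buf_t, the first three-byte append allocates 259 bytes and the following appends reuse that storage until it is exhausted.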
14 years agoChange 4 VALUE (String) members of the parser_t struct to str_t type
Wincent Colaiuta [Fri, 8 May 2009 15:44:01 +0000 (17:44 +0200)] 
Change 4 VALUE (String) members of the parser_t struct to str_t type

This is unfortunately quite a large commit because the nature of
the change requires many parts to be modified at once; the
intermediate stages are not buildable and therefore not
bisectable.

Change the capture, output, link_target and link_text members of
the parser struct from VALUE (String) type to str_t. This should
improve performance because the str_t is faster and designed for
easy reuse so we can allocate a few instances at the beginning
of parsing and then use them repeatedly throughout the parse,
thus avoiding many time-consuming allocations.

Remove the "capturing" member and instead use the "capture"
pointer as an indication of whether capturing is in progress.

Change the type of the "target" param in the
_Wikitext_pop_from_stack function (and the other "pop
from stack" functions) from VALUE (String) to pointer
to str_t.

Change the type of the "check_autolink" parameter to the
_Wikitext_append_hyperlink function from VALUE (boolean) to
bool.

Remove redundant passing in of parser->output to the
_Wikitext_pop_from_stack function.

Teach _Wikitext_blank to accept a pointer to a str_t struct
rather than a Ruby String (VALUE).

Add parser_new function to encapsulate the initial allocation
and initialization of the parser_t struct.

Rename str_append_rb_str function to str_append_string for
consistency with other functions in str.c.

Signed-off-by: Wincent Colaiuta <win@wincent.com>
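A minimal sketch of using the "capture" pointer itself as the capturing indicator, with buffers that are set up once and reused. All names other than "capture", "output" and "link_target" are hypothetical, and buf_t is a simplified stand-in for str_t.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

// Simplified stand-in for str_t: fixed storage that is reused, never freed.
typedef struct {
    char   data[128];
    size_t len;
} buf_t;

typedef struct {
    buf_t  output;
    buf_t  link_target;
    buf_t *capture; // NULL when not capturing, otherwise the capture target
} parser_t;

// No separate flag is needed: capturing is in progress iff capture is set.
bool parser_is_capturing(const parser_t *p)
{
    return p->capture != NULL;
}

// Start capturing into target, resetting its length but keeping its storage.
void parser_begin_capture(parser_t *p, buf_t *target)
{
    assert(p != NULL && target != NULL);
    target->len = 0;
    p->capture = target;
}

void parser_end_capture(parser_t *p)
{
    p->capture = NULL;
}

// Text goes to the capture buffer if one is set, otherwise to the output.
// Bytes that do not fit are silently dropped in this sketch.
void parser_emit(parser_t *p, char c)
{
    buf_t *dest = p->capture ? p->capture : &p->output;
    if (dest->len < sizeof(dest->data)) dest->data[dest->len++] = c;
}
```

Ending a capture only clears the pointer; the buffer and its contents stay available to the caller.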
14 years agoPartially revert "Remove GC_WRAP_STR and GC_WRAP_ARY macros"
Wincent Colaiuta [Fri, 8 May 2009 16:24:36 +0000 (18:24 +0200)] 
Partially revert "Remove GC_WRAP_STR and GC_WRAP_ARY macros"

Add back in the GC_WRAP_STR macro as I've now found a use for it
(specifically, inside functions like Wikitext_parser_sanitize_link_target).

This partially reverts commit ea1f3c0.

Signed-off-by: Wincent Colaiuta <win@wincent.com>
14 years agoAdd str_append_rb_str function
Wincent Colaiuta [Fri, 8 May 2009 16:13:54 +0000 (18:13 +0200)] 
Add str_append_rb_str function

This function appends the passed-in Ruby String object to an
existing str_t instance.

Signed-off-by: Wincent Colaiuta <win@wincent.com>
14 years agoRemove GC_WRAP_STR and GC_WRAP_ARY macros
Wincent Colaiuta [Fri, 8 May 2009 14:51:34 +0000 (16:51 +0200)] 
Remove GC_WRAP_STR and GC_WRAP_ARY macros

Now that the parser is taking care of Garbage Collection arrangements
these macros are no longer used or required.

Signed-off-by: Wincent Colaiuta <win@wincent.com>
14 years agoMake parser struct participate in Ruby's Garbage Collection
Wincent Colaiuta [Fri, 8 May 2009 14:49:04 +0000 (16:49 +0200)] 
Make parser struct participate in Ruby's Garbage Collection

Instead of having individual str_t and ary_t members participate
in Ruby's mark-and-sweep Garbage Collection, put the parser
struct on the stack and make the parser participate; it will
be responsible for cleaning up its own member resources when
it falls out of scope.

Signed-off-by: Wincent Colaiuta <win@wincent.com>
14 years agoAdd "capturing" member to parser struct
Wincent Colaiuta [Fri, 8 May 2009 14:08:03 +0000 (16:08 +0200)] 
Add "capturing" member to parser struct

This is preparation for the eventual move of some, perhaps all,
of the members which are currently of String (VALUE) type to the
faster, more easily reused str_t type.

With the VALUE type we can check whether a member is initialized
or in use by doing a NIL_P(member) test.

This is not possible with the str_t type as it is a struct rather
than a pointer (although admittedly, we will be using a pointer to
the struct rather than the struct itself).

We don't want to dispose of the struct and set the pointer to NULL
because the whole point of reusing the str_t structs is that we
can allocate them only once at the start of parsing and then
use them over and over.

Likewise we don't want to abuse the "len" member of the str_t
struct (for example, setting it to -1 to flag that it is not
in use), because it is not exactly intuitive or self-evident.

Similarly, we don't want to add an additional struct member (a
boolean) to indicate whether the struct is in use or not. The
struct itself shouldn't have to know or care about this; this
should be the responsibility of the caller using the struct.

So for now we set up this "capturing" bool so that we can
track when the parser is in capturing mode. The intention is
that in a later commit the "capture" member will become a
str_t instance (or a pointer to one).

Signed-off-by: Wincent Colaiuta <win@wincent.com>
14 years agoReorder parser struct members for better word alignment
Wincent Colaiuta [Fri, 8 May 2009 13:42:19 +0000 (15:42 +0200)] 
Reorder parser struct members for better word alignment

Keep long and pointer members together to guarantee alignment
on word boundaries.

Signed-off-by: Wincent Colaiuta <win@wincent.com>
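To illustrate the effect of member ordering, here is a small sketch with two hypothetical structs (not the real parser_t). The compiler aligns each member in either layout; what grouping the word-sized members changes is how much padding it has to insert.

```c
#include <assert.h>
#include <stddef.h>

// Small members interleaved with word-sized ones: padding is typically
// inserted after each char so that the following member is aligned.
typedef struct {
    char  a;
    long  x;
    char  b;
    void *y;
} interleaved_t;

// Word-sized members first, small members packed together at the end.
typedef struct {
    long  x;
    void *y;
    char  a;
    char  b;
} grouped_t;

// Number of bytes saved by grouping (typically 8 on an LP64 platform).
size_t padding_saved(void)
{
    assert(sizeof(grouped_t) <= sizeof(interleaved_t));
    return sizeof(interleaved_t) - sizeof(grouped_t);
}
```

The exact sizes are ABI-dependent, so the sketch computes the difference rather than hard-coding it.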
14 years agoUse C99 _Bool type
Wincent Colaiuta [Fri, 8 May 2009 13:39:27 +0000 (15:39 +0200)] 
Use C99 _Bool type

Seeing as we already compile in C99 mode, we may as well make use
of the _Bool type defined by that standard. We also include the
system "stdbool.h" header so as to have access to the "bool", "true"
and "false" convenience macros.

Signed-off-by: Wincent Colaiuta <win@wincent.com>
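One practical difference between _Bool and a char used as a boolean, sketched with hypothetical helper names: conversion to _Bool normalizes any non-zero value to 1, whereas conversion to unsigned char keeps only the low-order bits.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>

// Conversion to bool (_Bool) yields 1 for any non-zero value and 0 otherwise.
bool flag_from_bool(int value)
{
    bool b = value;
    assert(b == 0 || b == 1);
    return b;
}

// Conversion to unsigned char reduces the value modulo UCHAR_MAX + 1, so a
// non-zero int whose low-order bits are all zero becomes 0 ("false").
unsigned char flag_from_char(int value)
{
    return (unsigned char)value;
}
```

For example, passing UCHAR_MAX + 1 (256 on most platforms) gives true from the first helper and 0 from the second.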
14 years agoUse char type for boolean struct members
Wincent Colaiuta [Fri, 8 May 2009 13:29:41 +0000 (15:29 +0200)] 
Use char type for boolean struct members

It's wasteful to use a 32 (possibly even 64) bit integer to hold a
simple boolean value, so use the char type instead.

Signed-off-by: Wincent Colaiuta <win@wincent.com>
14 years agoDefine and use TRUE and FALSE macros
Wincent Colaiuta [Fri, 8 May 2009 13:28:24 +0000 (15:28 +0200)] 
Define and use TRUE and FALSE macros

This should be slightly more readable and less error-prone.

Signed-off-by: Wincent Colaiuta <win@wincent.com>
14 years agoConvert pending_crlf struct member to an int
Wincent Colaiuta [Fri, 8 May 2009 13:19:29 +0000 (15:19 +0200)] 
Convert pending_crlf struct member to an int

Rather than using a VALUE here use an int for simpler boolean tests.

Signed-off-by: Wincent Colaiuta <win@wincent.com>
14 years agoRemove superfluous comments about autolinking
Wincent Colaiuta [Fri, 8 May 2009 13:12:20 +0000 (15:12 +0200)] 
Remove superfluous comments about autolinking

These comments don't really do anything to help the reader understand
the workings of the code; they are merely informative.

Signed-off-by: Wincent Colaiuta <win@wincent.com>
14 years agoConvert autolink struct member to an int
Wincent Colaiuta [Fri, 8 May 2009 13:11:35 +0000 (15:11 +0200)] 
Convert autolink struct member to an int

Rather than using a VALUE here use an int for simpler boolean tests.

Signed-off-by: Wincent Colaiuta <win@wincent.com>
14 years agoConvert space_to_underscore struct member to an int
Wincent Colaiuta [Fri, 8 May 2009 13:09:08 +0000 (15:09 +0200)] 
Convert space_to_underscore struct member to an int

Rather than using a VALUE here use an int for simpler boolean tests.

Signed-off-by: Wincent Colaiuta <win@wincent.com>
14 years agoDrop second param from _Wikitext_pop_all_from_stack
Wincent Colaiuta [Fri, 8 May 2009 12:54:28 +0000 (14:54 +0200)] 
Drop second param from _Wikitext_pop_all_from_stack

All three call sites of _Wikitext_pop_all_from_stack pass in
Qnil as the second parameter, so drop the param and just
hard-code Qnil.

Signed-off-by: Wincent Colaiuta <win@wincent.com>
14 years agoUse _Wikitext_pop_all_from_stack before returning output
Wincent Colaiuta [Fri, 8 May 2009 12:52:59 +0000 (14:52 +0200)] 
Use _Wikitext_pop_all_from_stack before returning output

Use the function instead of performing a manual for-loop.

Signed-off-by: Wincent Colaiuta <win@wincent.com>
14 years agoImprove efficiency of _Wikitext_pop_all_from_stack
Wincent Colaiuta [Fri, 8 May 2009 12:50:31 +0000 (14:50 +0200)] 
Improve efficiency of _Wikitext_pop_all_from_stack

Use a for-loop instead of repeatedly calling ary_entry inside
a while-loop. The simple integer comparison will be faster
than the function call. (And in any case, the
_Wikitext_pop_from_stack function which is called here will
do an ary_entry call anyway; so what's really happening here
with this change is that we call ary_entry once for each item
instead of twice.)

Signed-off-by: Wincent Colaiuta <win@wincent.com>
14 years agoReuse link_target if link_text is Qnil in _Wikitext_append_hyperlink
Wincent Colaiuta [Fri, 8 May 2009 12:10:54 +0000 (14:10 +0200)] 
Reuse link_target if link_text is Qnil in _Wikitext_append_hyperlink

This cleans up a few call sites of _Wikitext_append_hyperlink. The
majority of these sites pass in the same text for the link target
and link text parameters, so teach the function to automatically
reuse the link target as the link text if no link text is explicitly
provided.

Signed-off-by: Wincent Colaiuta <win@wincent.com>
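The fallback described above might look something like this sketch. It uses plain C strings and NULL where the real function takes Ruby values and Qnil, the signature is hypothetical, and no escaping is performed.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

// Write an anchor element to out. When link_text is NULL, reuse link_target
// as the visible text so that callers need not pass the same string twice.
// Returns the length that was (or would have been) written, following the
// snprintf convention. Neither argument is escaped in this sketch.
int append_hyperlink(char *out, size_t cap,
                     const char *link_target, const char *link_text)
{
    assert(out != NULL && link_target != NULL);
    if (link_text == NULL) link_text = link_target;
    return snprintf(out, cap, "<a href=\"%s\">%s</a>", link_target, link_text);
}
```

A call site that previously passed the target twice can pass NULL as the last argument and get identical output.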
14 years agoMinor clean-up in _Wikitext_rollback_failed_external_link
Wincent Colaiuta [Fri, 8 May 2009 12:00:31 +0000 (14:00 +0200)] 
Minor clean-up in _Wikitext_rollback_failed_external_link

Minor reorganization to make _Wikitext_rollback_failed_external_link a
little cleaner. Avoid the almost identical calls to
_Wikitext_append_hyperlink and instead set up a link_class local
variable.

Signed-off-by: Wincent Colaiuta <win@wincent.com>
14 years agoDon't apply "external" class when rolling back failed external links
Wincent Colaiuta [Fri, 8 May 2009 11:56:24 +0000 (13:56 +0200)] 
Don't apply "external" class when rolling back failed external links

We were incorrectly applying the "external" CSS class when rolling back
failed external links (eg. '[/hello this').

Signed-off-by: Wincent Colaiuta <win@wincent.com>
14 years agoHandle blank link text
Wincent Colaiuta [Fri, 8 May 2009 11:41:31 +0000 (13:41 +0200)] 
Handle blank link text

We already correctly handled zero-width link text (eg. [[foo|]]),
but did not check for link text which was non-zero-width but
blank (eg. [[foo| ]]).

Signed-off-by: Wincent Colaiuta <win@wincent.com>
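A blankness test along these lines covers both cases. This is a sketch with a hypothetical name and signature; it treats only the space character as blank, which is an assumption about what the parser considers whitespace here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

// Return true if the len bytes at s consist solely of spaces. A zero-width
// string is also blank, so "[[foo|]]" and "[[foo| ]]" are treated alike.
bool text_is_blank(const char *s, size_t len)
{
    assert(s != NULL || len == 0);
    for (size_t i = 0; i < len; i++) {
        if (s[i] != ' ') return false;
    }
    return true;
}
```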
14 years agoRefactor _Wikitext_rollback_failed_link function and friends
Wincent Colaiuta [Fri, 8 May 2009 11:31:01 +0000 (13:31 +0200)] 
Refactor _Wikitext_rollback_failed_link function and friends

The _Wikitext_rollback_failed_link function now encapsulates the
common pattern of trying to roll back failed internal and external
links in a single function call.

On those occasions when we want to roll back only one type of link
we must instead use the _Wikitext_rollback_failed_internal_link or
_Wikitext_rollback_failed_external_link functions.

Signed-off-by: Wincent Colaiuta <win@wincent.com>
14 years agoRedefine TOKEN_TEXT in terms of TOKEN_LEN
Wincent Colaiuta [Fri, 8 May 2009 10:53:03 +0000 (12:53 +0200)] 
Redefine TOKEN_TEXT in terms of TOKEN_LEN

Rather than having both macros calculate the token length, do
it in one macro only and re-use that in the other one.

Signed-off-by: Wincent Colaiuta <win@wincent.com>
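The shape of the change, sketched with hypothetical types: token_t and slice_t below are illustrative, and the real TOKEN_TEXT presumably produces a Ruby String rather than a slice. The point is that the length is computed in exactly one place.

```c
#include <assert.h>
#include <stddef.h>

typedef struct {
    const char *start; // first byte of the token
    const char *stop;  // one past the last byte of the token
} token_t;

typedef struct {
    const char *ptr;
    size_t      len;
} slice_t;

// The length calculation lives in one macro only...
#define TOKEN_LEN(t)  ((size_t)((t)->stop - (t)->start))

// ...and the text macro reuses it instead of repeating the subtraction.
#define TOKEN_TEXT(t) ((slice_t){ (t)->start, TOKEN_LEN(t) })

// Convenience constructor that checks the pointers are in order.
token_t token_make(const char *start, const char *stop)
{
    assert(start != NULL && stop >= start);
    token_t t = { start, stop };
    return t;
}
```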
14 years agoRename _Wikitext_downcase to _Wikitext_downcase_bang
Wincent Colaiuta [Fri, 8 May 2009 08:43:51 +0000 (10:43 +0200)] 
Rename _Wikitext_downcase to _Wikitext_downcase_bang

This is a destructive function which overwrites the original
contents of the string, so indicate that in the function name
itself.

Signed-off-by: Wincent Colaiuta <win@wincent.com>
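A destructive downcase in this spirit could be sketched as follows. The signature is hypothetical, and in the default "C" locale tolower only affects ASCII letters.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

// Lowercase the len bytes at s in place, overwriting the original contents.
// The "_bang" suffix mirrors the Ruby convention for destructive methods.
void downcase_bang(char *s, size_t len)
{
    assert(s != NULL || len == 0);
    for (size_t i = 0; i < len; i++) {
        // Cast via unsigned char: tolower is undefined for negative values.
        s[i] = (char)tolower((unsigned char)s[i]);
    }
}
```

Because the input is overwritten, callers that still need the original must copy it first.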
14 years agoTeach _Wikitext_append_hyperlink to check the autolink setting
Wincent Colaiuta [Thu, 7 May 2009 23:03:44 +0000 (01:03 +0200)] 
Teach _Wikitext_append_hyperlink to check the autolink setting

Rather than checking the autolink setting in the numerous sites
where _Wikitext_append_hyperlink is called, move the check into
the function itself, and pass a flag in specifying whether to
perform the check.

The overall saving here is about 8 lines thanks to the eliminated
repetition.

Signed-off-by: Wincent Colaiuta <win@wincent.com>
14 years agoRename _Wikitext_hyperlink to _Wikitext_append_hyperlink
Wincent Colaiuta [Thu, 7 May 2009 22:55:34 +0000 (00:55 +0200)] 
Rename _Wikitext_hyperlink to _Wikitext_append_hyperlink

This more descriptive function name matches the new behaviour
of the method, indicating that it appends the hyperlink rather
than just returning it.

Signed-off-by: Wincent Colaiuta <win@wincent.com>
14 years agoRemove temporary string variable from _Wikitext_hyperlink
Wincent Colaiuta [Thu, 7 May 2009 22:52:05 +0000 (00:52 +0200)] 
Remove temporary string variable from _Wikitext_hyperlink

Now that _Wikitext_hyperlink returns void there is no longer any
need to use a temporary String instance. Instead, we append
directly to the parser->output buffer, thus saving an allocation.

Signed-off-by: Wincent Colaiuta <win@wincent.com>
14 years agoMake _Wikitext_hyperlink return void
Wincent Colaiuta [Thu, 7 May 2009 22:43:13 +0000 (00:43 +0200)] 
Make _Wikitext_hyperlink return void

Now that _Wikitext_hyperlink automatically appends its output there
is no need for it to return anything explicitly.

Signed-off-by: Wincent Colaiuta <win@wincent.com>
14 years agoAutomatically append link in _Wikitext_hyperlink
Wincent Colaiuta [Thu, 7 May 2009 22:39:18 +0000 (00:39 +0200)] 
Automatically append link in _Wikitext_hyperlink

All call sites of _Wikitext_hyperlink take the return value and
append it to the parser->output buffer.

Avoid this repetition by performing the output automatically
from within the function itself so that the callers don't
have to do it.

Signed-off-by: Wincent Colaiuta <win@wincent.com>
14 years agoAdd comment justifying the scope_includes_space variable
Wincent Colaiuta [Thu, 7 May 2009 22:30:54 +0000 (00:30 +0200)] 
Add comment justifying the scope_includes_space variable

This comment serves as a reminder for why this variable exists
(to remember what was on the stack prior to popping); without
it the reader might ask "why do we have a temporary variable
here which is only used once?".

Signed-off-by: Wincent Colaiuta <win@wincent.com>
14 years agoRemove stale comment
Wincent Colaiuta [Thu, 7 May 2009 15:57:32 +0000 (17:57 +0200)] 
Remove stale comment

This comment is a left-over from the distant past when many of
these functions were explicitly marked as inline functions,
rather than letting the compiler decide when to inline.

Signed-off-by: Wincent Colaiuta <win@wincent.com>
14 years agoUpdate README with donation instructions
Wincent Colaiuta [Thu, 7 May 2009 10:22:43 +0000 (12:22 +0200)] 
Update README with donation instructions

Signed-off-by: Wincent Colaiuta <win@wincent.com>
14 years agoUpdate link to issue tracker
Wincent Colaiuta [Thu, 7 May 2009 10:19:30 +0000 (12:19 +0200)] 
Update link to issue tracker

Although the old URL still works, update to the official URL
anyway.

Signed-off-by: Wincent Colaiuta <win@wincent.com>
14 years agoBump version number post-release
Wincent Colaiuta [Thu, 7 May 2009 09:49:47 +0000 (11:49 +0200)] 
Bump version number post-release

Signed-off-by: Wincent Colaiuta <win@wincent.com>
14 years agoBump version number for 1.6 release 1.6
Wincent Colaiuta [Thu, 7 May 2009 09:48:21 +0000 (11:48 +0200)] 
Bump version number for 1.6 release

Signed-off-by: Wincent Colaiuta <win@wincent.com>
14 years agoUpdate release notes for 1.6 release
Wincent Colaiuta [Thu, 7 May 2009 09:47:52 +0000 (11:47 +0200)] 
Update release notes for 1.6 release

Signed-off-by: Wincent Colaiuta <win@wincent.com>
14 years agoUse absolute paths in internal "requires"
Wincent Colaiuta [Thu, 7 May 2009 09:35:57 +0000 (11:35 +0200)] 
Use absolute paths in internal "requires"

Ensure that when locally testing or otherwise using a specific
version of the extension that the files included using "require"
come from the same version and not from some other version in the
load path.

For example, prior to this commit, doing an:

  irb -r ext/wikitext lib/wikitext/string

Would not produce the desired result. First the local copy of the
extension would be loaded, then the local "lib/wikitext/string",
but then the latter would do a "require 'wikitext/parser'", which
would load the first corresponding file in the load path (usually
the latest installed gem), which would in turn do a "require
'wikitext'" and end up loading the first corresponding file in
the load path.

Signed-off-by: Wincent Colaiuta <win@wincent.com>
14 years agoSpecify ":indent => false" default in wikitext/string extension
Wincent Colaiuta [Wed, 6 May 2009 23:03:58 +0000 (01:03 +0200)] 
Specify ":indent => false" default in wikitext/string extension

Seeing as the String extension is primarily for use in Rails
applications, where setting up Haml to run with "ugly" mode turned
on is a good idea, it makes sense to make the "w" and "to_wikitext"
methods on the String class pass in ":indent => false" by default.

This can be overridden if desired by passing in an explicit indent
such as ":indent => 0".

See:

  https://wincent.com/issues/817

Signed-off-by: Wincent Colaiuta <win@wincent.com>
14 years agoAccept "indent" value of "false" to disable indentation entirely
Wincent Colaiuta [Wed, 6 May 2009 22:54:26 +0000 (00:54 +0200)] 
Accept "indent" value of "false" to disable indentation entirely

This produces slightly more compact HTML output and is intended to
match the output of Haml's "ugly" mode.

Signed-off-by: Wincent Colaiuta <win@wincent.com>
14 years agoAdd support for "absolute" image "src" attributes
Wincent Colaiuta [Wed, 6 May 2009 22:02:46 +0000 (00:02 +0200)] 
Add support for "absolute" image "src" attributes

Under the default settings input like "{{foo.png}}" will be translated
to an "img" tag with "src" attribute of "/images/foo.png".

With this commit, we provide a shortcut for breaking out of the "prefix"
directory: any target beginning with a slash will be interpreted as
"absolute". So input like "{{/foo.png}}" will yield a "src" of
"/foo.png".

This is a somewhat prettier mechanism than using inputs like
"{{../foo.png}}", which already worked under the current translator.

Signed-off-by: Wincent Colaiuta <win@wincent.com>
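The rule for choosing the "src" attribute can be sketched as below. The function name and signature are hypothetical; the prefix and the example inputs are the ones given in the commit message, and the empty-target case reflects the related "Pass through zero-length image targets unchanged" commit.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

// Compute the "src" attribute for an image target.
// - Empty target: return -1 so the caller can emit "{{}}" unchanged.
// - Target beginning with a slash: "absolute", used as-is.
// - Anything else: prepended with the prefix (for example "/images/").
// Otherwise returns the snprintf-style length of the result.
int image_src(char *out, size_t cap, const char *prefix, const char *target)
{
    assert(out != NULL && prefix != NULL && target != NULL);
    if (target[0] == '\0') return -1;
    if (target[0] == '/') return snprintf(out, cap, "%s", target);
    return snprintf(out, cap, "%s%s", prefix, target);
}
```

With the prefix "/images/", the target "foo.png" becomes "/images/foo.png" and "/foo.png" stays "/foo.png".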
14 years agoUse C99 comment style for consistency
Wincent Colaiuta [Wed, 6 May 2009 21:56:16 +0000 (23:56 +0200)] 
Use C99 comment style for consistency

Signed-off-by: Wincent Colaiuta <win@wincent.com>
14 years agoPass through zero-length image targets unchanged
Wincent Colaiuta [Wed, 6 May 2009 21:54:12 +0000 (23:54 +0200)] 
Pass through zero-length image targets unchanged

That is, "{{}}" should not be turned into an "img" tag.

Signed-off-by: Wincent Colaiuta <win@wincent.com>
14 years agoMinor corrections and clarifications to README file
Wincent Colaiuta [Wed, 6 May 2009 21:38:14 +0000 (23:38 +0200)] 
Minor corrections and clarifications to README file

Signed-off-by: Wincent Colaiuta <win@wincent.com>
14 years agoBump version number post-release
Wincent Colaiuta [Tue, 28 Apr 2009 10:33:24 +0000 (12:33 +0200)] 
Bump version number post-release

Signed-off-by: Wincent Colaiuta <win@wincent.com>
14 years agoUpdate version number and release notes for 1.5.3 release 1.5.3
Wincent Colaiuta [Tue, 28 Apr 2009 10:31:57 +0000 (12:31 +0200)] 
Update version number and release notes for 1.5.3 release

Signed-off-by: Wincent Colaiuta <win@wincent.com>
14 years agoMerge branch 'maint'
Wincent Colaiuta [Tue, 28 Apr 2009 10:22:38 +0000 (12:22 +0200)] 
Merge branch 'maint'

14 years agoHandle empty (whitespace-only) link targets
Wincent Colaiuta [Tue, 28 Apr 2009 10:11:57 +0000 (12:11 +0200)] 
Handle empty (whitespace-only) link targets

Previously we accepted buggy input like "[[ ]]", "[[  ]]"
and "[[   |foo]]" and dutifully turned it into empty links
like:

  <a href="/wiki/"></a>

And:

  <a href="/wiki/">foo</a>

It clearly makes no sense for a bad link like "[[  ]]" to
take the user back to the wiki index, so now we spit out
these bad links verbatim to provide feedback to the user
that there's something wrong with their input.

Signed-off-by: Wincent Colaiuta <win@wincent.com>
14 years agoHandle empty (zero-width) link targets
Wincent Colaiuta [Tue, 28 Apr 2009 09:34:41 +0000 (11:34 +0200)] 
Handle empty (zero-width) link targets

Prior to this empty link targets like "[[]]", "[[|]]" and
"[[|foo]]" were enough to derail the parser because they
caused an exception to be raised.

See:

  https://wincent.com/issues/1289

Signed-off-by: Wincent Colaiuta <win@wincent.com>
14 years agoUpdate release notes
Wincent Colaiuta [Wed, 15 Apr 2009 23:16:04 +0000 (01:16 +0200)] 
Update release notes

Signed-off-by: Wincent Colaiuta <win@wincent.com>
14 years agoGit: ignore ".specification" file
Wincent Colaiuta [Wed, 15 Apr 2009 22:52:52 +0000 (00:52 +0200)] 
Git: ignore ".specification" file

Signed-off-by: Wincent Colaiuta <win@wincent.com>
14 years agoAdd vim_formatter script
Wincent Colaiuta [Wed, 15 Apr 2009 22:50:29 +0000 (00:50 +0200)] 
Add vim_formatter script

This helps when running specs from inside Vim. For more info
on the Vim side of the integration see:

  https://wincent.com/blog/running-rspec-specs-from-inside-vim

Signed-off-by: Wincent Colaiuta <win@wincent.com>
14 years agoSwitch from GPL to 2-clause BSD
Wincent Colaiuta [Wed, 15 Apr 2009 10:34:41 +0000 (12:34 +0200)] 
Switch from GPL to 2-clause BSD

I want a more permissive license, and the "Simplified"
or "2-clause" variant of the BSD license (as used by
FreeBSD) looks to be the best choice.

Signed-off-by: Wincent Colaiuta <win@wincent.com>
14 years agoSilly Markdown benchmark
Wincent Colaiuta [Tue, 7 Apr 2009 13:35:27 +0000 (15:35 +0200)] 
Silly Markdown benchmark

How fast can we churn through the Markdown README
file concatenated 32 times? It's not even wikitext
markup, but we want to test throughput.

See:

  https://wincent.com/blog/markdown-sucks

Signed-off-by: Wincent Colaiuta <win@wincent.com>
14 years agoBump version number post-release
Wincent Colaiuta [Fri, 27 Mar 2009 13:04:52 +0000 (14:04 +0100)] 
Bump version number post-release

Signed-off-by: Wincent Colaiuta <win@wincent.com>
14 years agoBump version number for 1.5.2 release 1.5.2
Wincent Colaiuta [Fri, 27 Mar 2009 13:03:30 +0000 (14:03 +0100)] 
Bump version number for 1.5.2 release

Signed-off-by: Wincent Colaiuta <win@wincent.com>
14 years agoExtend e7f856d to handle nested shorthand
Wincent Colaiuta [Fri, 27 Mar 2009 12:57:54 +0000 (13:57 +0100)] 
Extend e7f856d to handle nested shorthand

The fix in e7f856d handles non-nested ">" BLOCKQUOTE
shorthand. In this commit we extend the fix to also
cope with nested shorthand (">>", ">>>" etc).

Signed-off-by: Wincent Colaiuta <win@wincent.com>
14 years agoFix PRE_START and BLOCKQUOTE_START following shorthand
Wincent Colaiuta [Fri, 27 Mar 2009 12:42:49 +0000 (13:42 +0100)] 
Fix PRE_START and BLOCKQUOTE_START following shorthand

This adds a special case for PRE_START and
BLOCKQUOTE_START tokens which immediately follow
PRE and BLOCKQUOTE shorthand lines.

Basically, if they appear in the first column
then they are valid, because they close the old
(shorthand) blocks and open a new block. Anywhere
other than the first column they are considered
illegal.

See:
  https://wincent.com/issues/818

Signed-off-by: Wincent Colaiuta <win@wincent.com>
14 years agoSwitch to Ragel 6.4
Wincent Colaiuta [Fri, 27 Mar 2009 12:08:01 +0000 (13:08 +0100)] 
Switch to Ragel 6.4

I've updated from Ragel 6.1 to Ragel 6.4 on my local development
machine, so the generated parser file has a few differences now.

Signed-off-by: Wincent Colaiuta <win@wincent.com>
14 years agoWork around Rails bug #2266
Wincent Colaiuta [Fri, 27 Mar 2009 11:12:46 +0000 (12:12 +0100)] 
Work around Rails bug #2266

See:
  https://rails.lighthouseapp.com/projects/8994/tickets/2266

Signed-off-by: Wincent Colaiuta <win@wincent.com>
14 years agoConsistently apply "mailto" class to mailto URIs
Wincent Colaiuta [Fri, 27 Mar 2009 11:03:35 +0000 (12:03 +0100)] 
Consistently apply "mailto" class to mailto URIs

Implement more consistent CSS styling for mailto URIs.
The following inputs previously received "external" CSS
styling because mailto URIs are tokenized as URIs rather
than MAIL tokens:

  mailto:user@example.com
  [mailto:user@example.com email me]

Only raw email addresses were correctly styled:

  user@example.com

Now all of the above cases consistently use the "mailto"
class.

See:
  https://wincent.com/issues/1262
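
As a rough illustration of the intended behaviour (not the C code in this commit), the class selection can be sketched in Ruby. The helper name link_css_class and the simplified address pattern are assumptions made for this example only:

```ruby
# Hypothetical helper: picks the CSS class for a link target.
# The real parser decides this at the token level in C; this is only a sketch.
def link_css_class(target)
  # Explicit mailto URIs and bare addresses both get the "mailto" class.
  # The address pattern is deliberately simplistic (an assumption).
  if target.start_with?('mailto:') || target.match?(/\A[^@\s\/]+@[^@\s\/]+\z/)
    'mailto'
  else
    'external'
  end
end
```

With this sketch, "mailto:user@example.com" and "user@example.com" both map to "mailto", while an ordinary HTTP URI maps to "external".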

Signed-off-by: Wincent Colaiuta <win@wincent.com>
14 years agoBump version number post-release
Wincent Colaiuta [Tue, 17 Mar 2009 09:51:55 +0000 (10:51 +0100)] 
Bump version number post-release

Signed-off-by: Wincent Colaiuta <win@wincent.com>
14 years agoBump version number for 1.5.1 release 1.5.1
Wincent Colaiuta [Tue, 17 Mar 2009 09:44:08 +0000 (10:44 +0100)] 
Bump version number for 1.5.1 release

Signed-off-by: Wincent Colaiuta <win@wincent.com>
14 years agoAccept options hash in "to_wikitext" and "w" methods
Wincent Colaiuta [Mon, 23 Feb 2009 21:39:01 +0000 (22:39 +0100)] 
Accept options hash in "to_wikitext" and "w" methods

Seeing as "w" is the most frequently-used means of translating
wikitext in the context of a web application, it makes sense to
provide a means of passing in an optional options hash so that
overrides can be conveniently fed into the parser on demand.

Signed-off-by: Wincent Colaiuta <win@wincent.com>
14 years agoBump version number post-release
Wincent Colaiuta [Mon, 23 Feb 2009 11:33:38 +0000 (12:33 +0100)] 
Bump version number post-release

Signed-off-by: Wincent Colaiuta <win@wincent.com>
14 years agoBump version number prior to 1.5.0 release 1.5.0
Wincent Colaiuta [Mon, 23 Feb 2009 11:31:16 +0000 (12:31 +0100)] 
Bump version number prior to 1.5.0 release

Signed-off-by: Wincent Colaiuta <win@wincent.com>
14 years agoAdd "base_heading_level" option
Wincent Colaiuta [Mon, 23 Feb 2009 11:28:52 +0000 (12:28 +0100)] 
Add "base_heading_level" option

An integer between 0 and 6 denoting the current "heading level".
This can be used to inform the parser of the "context" in which
it is translating markup.

For example, the parser might be translating blog post excerpts
on a page where there is an "h1" title element for the page itself
and an "h2" title element for each excerpt. In this context it is
useful to set base_heading_level to 2, so that any "top level"
headings in the markup (that is "h1" elements) can be automatically
transformed into "h3" elements so that they appear to be
appropriately "nested" inside the containing page elements.
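
The adjustment described above amounts to adding the base level to the level written in the markup. A minimal Ruby sketch follows; the method name effective_heading_level is hypothetical, and clamping at "h6" is an assumption of this example rather than something stated in this commit:

```ruby
MAX_HEADING_LEVEL = 6 # HTML defines h1 through h6 only

# Hypothetical helper: maps a heading level found in the markup to the
# level that should be emitted for a given base_heading_level.
def effective_heading_level(markup_level, base_heading_level = 0)
  # Clamping to h6 is an assumption made for this sketch.
  [markup_level + base_heading_level, MAX_HEADING_LEVEL].min
end
```

Here effective_heading_level(1, 2) gives 3, matching the "h1" to "h3" example above.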

In this way, markup authors can be freed from thinking about
which header size they should use and just always start from "h1"
for their most general content and work their way down.

An additional benefit is that markup can be used in different
contexts at different levels of nesting and the headings will be
adjusted to suit automatically with no intervention from the
markup author.

Finally, it's worth noting that in contexts where the user input
is not necessarily trusted, this setting can be used to prevent
users from inappropriately employing "h1" tags in deeply-nested
contexts where they would otherwise disturb the visual harmony of
the page.

Signed-off-by: Wincent Colaiuta <win@wincent.com>
14 years agoRuby 1.9: add magic comments to indicate source file encoding
Wincent Colaiuta [Fri, 13 Feb 2009 17:28:33 +0000 (18:28 +0100)] 
Ruby 1.9: add magic comments to indicate source file encoding

Lots of these spec files are actually in UTF-8. Ruby 1.8.6
just slurped them in as byte streams so they just worked, but
1.9.1 expects them all to be ASCII and complains when they are
not.

We add these "magic" comments so that Ruby 1.9 knows that the
files are UTF-8-encoded and won't complain.

Signed-off-by: Wincent Colaiuta <win@wincent.com>
14 years agoRuby 1.9: use String#each_line instead of #each
Wincent Colaiuta [Fri, 13 Feb 2009 17:27:31 +0000 (18:27 +0100)] 
Ruby 1.9: use String#each_line instead of #each

String#each isn't implemented in 1.9, so switch to #each_line,
which should work in both 1.8.6 and 1.9.1.
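
For reference, a small sketch of the portable form (the method name collect_lines is made up for this example):

```ruby
# Collects the lines of a string using String#each_line, which exists in
# both Ruby 1.8 and 1.9+, unlike String#each.
def collect_lines(text)
  lines = []
  text.each_line { |line| lines << line } # each line keeps its trailing "\n"
  lines
end
```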

Signed-off-by: Wincent Colaiuta <win@wincent.com>
14 years agoBump version number post-release
Wincent Colaiuta [Tue, 3 Feb 2009 23:23:36 +0000 (00:23 +0100)] 
Bump version number post-release

Signed-off-by: Wincent Colaiuta <win@wincent.com>
14 years agoUpdate release notes to coincide with 1.4.1 release
Wincent Colaiuta [Tue, 3 Feb 2009 23:21:06 +0000 (00:21 +0100)] 
Update release notes to coincide with 1.4.1 release

Signed-off-by: Wincent Colaiuta <win@wincent.com>
14 years agoFix subtle Rakefile breakage
Wincent Colaiuta [Tue, 3 Feb 2009 22:48:00 +0000 (23:48 +0100)] 
Fix subtle Rakefile breakage

As mentioned in b12cb25, the 1.4.0 release was broken due to a misfire
during "rake package".

The diagnosis was as follows. Before each of the following tests, a
"rake clobber" was executed to start with a clean slate:

   - "rake gem": "make" does appear to be executed,
     but produced gem is incomplete
   - "rake make; rake gem": produced gem is complete
   - "rake package": "make" does appear to be executed,
     but produced gem is incomplete
   - "rake make; rake package": you can see "clobber"
     task getting executed, then "make"; you would
     expect this automated "make" not to work seeing
     as it is broken in two other cases above, but
     the built gem is complete and it works!

Specifically, in the incomplete gems, the generated
extension Makefile has these lines:

  SRCS = ary.c parser.c str.c token.c wikitext.c
  OBJS = ary.o parser.o str.o token.o wikitext.o

Instead of the desired:

  SRCS = ary.c parser.c str.c token.c wikitext.c wikitext_ragel.c
  OBJS = ary.o parser.o str.o token.o wikitext.o wikitext_ragel.o

Note that the Makefile in the source tree in all of these tests
was correct; it was only the file generated from the gems that
was incorrect.

Further inspection showed that the wikitext_ragel.c file was not
included in the gem. This was the key; the file was missing
from inside the gem even though it was correctly generated outside
while building the gem. Naturally, the Makefile was also missing the
file ("mkmf" generates the SRCS and OBJS lines based on what it
finds in the directory).

The reason the file was missing was that the "files" entry in the
Gem specification was evaluated too early, before the dependencies
in the tasks had been executed, and before the "wikitext_ragel.c"
file even existed.

Running "rake make" manually beforehand was enough to make
everything work because it ensured that the "wikitext_ragel.c"
file was present at the time the Gem specification was evaluated;
even though the file was later clobbered and rebuilt it didn't
matter because the "files" entry was correct.
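
The timing problem can be reproduced outside of Rake. The following Ruby sketch is not the Rakefile itself; it uses a temporary directory and a made-up method name (compare_file_lists) to show how a file list computed before a file is generated differs from one computed afterwards:

```ruby
require 'tmpdir'
require 'fileutils'

# Returns [early, late]: the list of .c files as seen before and after the
# generated file appears. Only the file names come from the Makefile lines
# quoted above; everything else is illustrative.
def compare_file_lists
  Dir.mktmpdir do |dir|
    list_c_files = lambda do
      Dir.glob(File.join(dir, '*.c')).map { |f| File.basename(f) }.sort
    end

    FileUtils.touch(File.join(dir, 'parser.c'))
    early = list_c_files.call # evaluated "too early", like the "files" entry

    FileUtils.touch(File.join(dir, 'wikitext_ragel.c')) # generated later
    late = list_c_files.call  # evaluated once the generated file exists

    [early, late]
  end
end
```

The early list lacks wikitext_ragel.c even though the file exists by the time the list is used, which is the same shape of problem as the incomplete gem.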

It would have been nice for the gem build to fail due to the
missing file. I would have been alerted then to the fact that
1.4.0 was broken; I'm going to see if I can make the build process
barf in the face of a missing file like that.

Signed-off-by: Wincent Colaiuta <win@wincent.com>
14 years agoBump version number prior to 1.4.1 release 1.4.1
Wincent Colaiuta [Tue, 3 Feb 2009 21:49:34 +0000 (22:49 +0100)] 
Bump version number prior to 1.4.1 release

This version is basically identical to 1.4.0 but the 1.4.0 gem that was
published on RubyForge was faulty due to an as-yet unexplained bug in
Rake or the Rakefile.

The gem was made using "rake package", which produces a gem which
passes all the specs but which is missing the required object file,
wikitext_ragel.o. "rake clobber; rake package" also suffers from the
missing object file.

"rake clobber; rake make; rake package", on the other hand, does
include the required file.

So for now I want to get a new release out to replace the broken gem
on RubyForge; will investigate the Rake/Rakefile bug subsequently.

Signed-off-by: Wincent Colaiuta <win@wincent.com>
14 years agoBump version number post-release
Wincent Colaiuta [Mon, 2 Feb 2009 21:22:29 +0000 (22:22 +0100)] 
Bump version number post-release

Signed-off-by: Wincent Colaiuta <win@wincent.com>
14 years agoAdd release notes to RDoc documentation 1.4.0
Wincent Colaiuta [Sun, 1 Feb 2009 21:39:18 +0000 (22:39 +0100)] 
Add release notes to RDoc documentation

Seeing as 1.4.0 introduces a backwards-incompatible change, include
release notes in the RDoc documentation.

Signed-off-by: Wincent Colaiuta <win@wincent.com>
14 years agoBump version number prior to 1.4.0 release
Wincent Colaiuta [Sun, 1 Feb 2009 21:28:29 +0000 (22:28 +0100)] 
Bump version number prior to 1.4.0 release

As this release involves a compatibility-breaking change
(the replacement of "special" links in internal link spans
with "path" links in external link spans) bumping the release
number up to 1.4.0.

Signed-off-by: Wincent Colaiuta <win@wincent.com>
14 years agoUpdate string additions to use new external link syntax
Wincent Colaiuta [Sun, 1 Feb 2009 21:15:49 +0000 (22:15 +0100)] 
Update string additions to use new external link syntax

This brings the string additions into sync with the new link
syntax introduced in 7491ebc.

Signed-off-by: Wincent Colaiuta <win@wincent.com>
14 years agoCreate "special links" from external rather than internal links
Wincent Colaiuta [Sun, 1 Feb 2009 21:03:47 +0000 (22:03 +0100)] 
Create "special links" from external rather than internal links

It was a clumsy design decision to offer detection of "special links"
inside internal links:

  [[/issues/20 | ticket #20]]

A much nicer syntax is to use external links instead:

  [/issues/20 ticket #20]

In this commit I rip out the old implementation as well as a
supporting instance variable (treat_slash_as_special), and a
couple of related low-level struct members used at the C
level (treat_slash_as_special and special_link) which were
themselves all probably evidence of "design smell".

In place of the old feature we move handling of "path"-style
links to inside of external links. Specs and docs are updated
accordingly.

See also:

  https://rails.wincent.com/issues/1208

Signed-off-by: Wincent Colaiuta <win@wincent.com>
14 years agoFix minor error in parser documentation
Wincent Colaiuta [Wed, 7 Jan 2009 00:32:03 +0000 (01:32 +0100)] 
Fix minor error in parser documentation

Signed-off-by: Wincent Colaiuta <win@wincent.com>
14 years agoFix minor errors in README
Wincent Colaiuta [Wed, 7 Jan 2009 00:29:35 +0000 (01:29 +0100)] 
Fix minor errors in README

Signed-off-by: Wincent Colaiuta <win@wincent.com>
14 years agoAdd license and feedback sections to readme
Wincent Colaiuta [Wed, 7 Jan 2009 00:26:39 +0000 (01:26 +0100)] 
Add license and feedback sections to readme

Signed-off-by: Wincent Colaiuta <win@wincent.com>
14 years agoBump version number post-release
Wincent Colaiuta [Tue, 6 Jan 2009 01:12:03 +0000 (02:12 +0100)] 
Bump version number post-release

Signed-off-by: Wincent Colaiuta <win@wincent.com>
14 years agoBump version number for 1.3.2 release 1.3.2
Wincent Colaiuta [Tue, 6 Jan 2009 01:08:08 +0000 (02:08 +0100)] 
Bump version number for 1.3.2 release

Signed-off-by: Wincent Colaiuta <win@wincent.com>
14 years agoToggle default value of space_to_underscore
Wincent Colaiuta [Tue, 6 Jan 2009 00:55:58 +0000 (01:55 +0100)] 
Toggle default value of space_to_underscore

I think the original choice for this was overly conservative, despite
the reasoning mentioned in the documentation:

  Converting spaces to underscores makes most URLs prettier, but it
  comes at a cost: when this mode is true the articles "foo bar" and
  "foo_bar" can no longer be disambiguated, and a link to "foo_bar"
  will actually resolve to "foo bar"; it is therefore recommended
  that you explicitly disallow underscores in titles at the
  application level so as to avoid this kind of confusion.

In reality, basically everyone using the wikitext module to power an
online wiki (that is, all of the users?) will want neat underscores
in their URLs rather than ugly "%20" escapes.
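
To make the trade-off concrete, here is a minimal Ruby sketch. The helper encode_title is hypothetical and only handles spaces; the real encoding is done by the parser in C:

```ruby
# Hypothetical helper: encodes the spaces in an article title for use in a URL.
def encode_title(title, space_to_underscore)
  title.gsub(' ', space_to_underscore ? '_' : '%20')
end
```

With space_to_underscore enabled, "foo bar" and "foo_bar" both encode to "foo_bar", which is the ambiguity the documentation warns about. With it disabled, "foo bar" becomes "foo%20bar" and the two titles stay distinct.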

Signed-off-by: Wincent Colaiuta <win@wincent.com>
14 years agoUse shared parser from Wikitext::Parser rather than String
Wincent Colaiuta [Tue, 6 Jan 2009 00:40:23 +0000 (01:40 +0100)] 
Use shared parser from Wikitext::Parser rather than String

Signed-off-by: Wincent Colaiuta <win@wincent.com>
14 years agoAdd "shared parser" class method to Wikitext::Parser
Wincent Colaiuta [Tue, 6 Jan 2009 00:28:43 +0000 (01:28 +0100)] 
Add "shared parser" class method to Wikitext::Parser

This is probably a better place to keep the optional singleton instance
than in the String extension where it currently is.

Signed-off-by: Wincent Colaiuta <win@wincent.com>
14 years agoAdd specs for version module
Wincent Colaiuta [Tue, 6 Jan 2009 00:19:03 +0000 (01:19 +0100)] 
Add specs for version module

Signed-off-by: Wincent Colaiuta <win@wincent.com>
14 years agoAdd specs for NilClass extensions
Wincent Colaiuta [Tue, 6 Jan 2009 00:15:18 +0000 (01:15 +0100)] 
Add specs for NilClass extensions

Signed-off-by: Wincent Colaiuta <win@wincent.com>