Issue 6493072: Allow digits in identifiers (Closed)

Created:
11 years, 8 months ago by Keith

Modified:
10 years, 11 months ago

Reviewers:
bernard, dak, lemzwerg, janek

CC:
lilypond-devel_gnu.org

Visibility:
Public.

Description

Allow digits in identifiers

Set lexer state to INITIAL for top-level expressions, switching to 'notes' mode inside music expressions.

Patch Set 1 : version allowing numbers at ends #

Total comments: 21

Patch Set 2 : version with digits, but not at the end #

Total comments: 1

Patch Set 3 : Use a plus character to introduce digits #

Patch Set 4 : Use a dot character to introduce digits #

Created: 11 years, 6 months ago

Stats: +27 lines, -13 lines

Status	File	Chunks	Lines	Comments
M	Documentation/learning/common-notation.itely	1 chunk	+5 lines, -0 lines	0 comments
A	input/regression/identifiers-with-digits.ly	1 chunk	+8 lines, -0 lines	0 comments
M	lily/lexer.ll	8 chunks	+12 lines, -12 lines	0 comments
M	vim/lilypond-syntax.vim	2 chunks	+2 lines, -1 line	0 comments

Messages

Total messages: 16

dak

All in all, a large step backwards for making the lexer behave predictably across modes ...

11 years, 8 months ago (2012-09-02 11:52:46 UTC) #1

Keith

11 years, 8 months ago (2012-09-02 17:59:53 UTC) #2

http://codereview.appspot.com/6493072/diff/14/input/regression/page-spacing-n...
File input/regression/page-spacing-nonstaff-lines-independent.ly (left):

http://codereview.appspot.com/6493072/diff/14/input/regression/page-spacing-n...
input/regression/page-spacing-nonstaff-lines-independent.ly:11: \addlyrics { high \skip2 }
On 2012/09/02 11:52:46, dak wrote:
> It is also not clear why c5 should not be
> a valid identifier when ce5 is.

Both /can/ be valid identifiers, if we want.
c4 = {...}   at top-level defines a variable, then inside any braces for
music-entry:
c4  is a quarter note on do.
\c4  references the variable.

Simpler than today, when I may use 'recs' as an identifier name, unless I
speak Spanish.
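
The mode distinction described above can be sketched in a few lines of Python. This is only an illustration of the idea, not LilyPond's lexer: the function name `classify`, the mode strings "initial" and "notes", and the seven-entry pitch table are all assumptions made for the example.

```python
import re

# Assumption: a tiny pitch-name table standing in for the real one.
PITCHES = {"c", "d", "e", "f", "g", "a", "b"}

def classify(token, mode):
    """Classify one token under a hypothetical two-mode lexer."""
    # A leading backslash always references an existing variable or command.
    if token.startswith("\\"):
        return ("reference", token[1:])
    # Inside music braces: letters are a pitch, trailing digits a duration.
    if mode == "notes":
        m = re.fullmatch(r"([a-z]+)(\d+)?", token)
        if m and m.group(1) in PITCHES:
            return ("note", m.group(1), m.group(2))
    # At top level: the same characters form a single identifier.
    if mode == "initial" and re.fullmatch(r"[a-z]+\d*", token):
        return ("identifier", token)
    return ("unknown", token)
```

Under these assumptions the same text `c4` is an identifier at top level and a quarter-note c inside braces, while `\c4` is a reference in either mode.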

http://codereview.appspot.com/6493072/diff/14/input/regression/ragged-bottom-...
File input/regression/ragged-bottom-one-page.ly (right):

http://codereview.appspot.com/6493072/diff/14/input/regression/ragged-bottom-...
input/regression/ragged-bottom-one-page.ly:13: \repeat unfold 16 { c'4 }
On 2012/09/02 11:52:46, dak wrote:
> Why are the braces needed here?

Maybe not strictly needed here, but showing the concept of always putting
music-with-durations inside at least one level of bracing.

http://codereview.appspot.com/6493072/diff/14/lily/lexer.ll
File lily/lexer.ll (left):

http://codereview.appspot.com/6493072/diff/14/lily/lexer.ll#oldcode397
lily/lexer.ll:397: <chords,notes,figures>{RESTNAME}/[-_]	|  // pseudo backup rule
On 2012/09/02 11:52:46, dak wrote:
> Did you check that r-. does still work as intended when removing this rule?

The syntax never allowed new identifier definitions in sequential/simultaneous
music.
 \relative c' { new-variable-name = { c d e f } } % never valid
So the lexer need not look ahead for un-escaped WORDs while scanning note and
rest entries.

http://codereview.appspot.com/6493072/diff/14/lily/lexer.ll
File lily/lexer.ll (right):

http://codereview.appspot.com/6493072/diff/14/lily/lexer.ll#newcode34
lily/lexer.ll:34: flex -b <this lexer file>
On 2012/09/02 11:52:46, dak wrote:
> Did you do this?

A long while ago; I haven't checked it lately.

http://codereview.appspot.com/6493072/diff/14/lily/lexer.ll#newcode476
lily/lexer.ll:476: {A}+	{
On 2012/09/02 11:52:46, dak wrote:
> So words no longer correspond to commands regarding their syntax in note mode?

COMMAND is still \WORD so they strictly correspond.

Only COMMAND is a token in note-mode, not WORD, because we are not defining new
identifiers here.

Previously, WORD was matching note-names, but that was wrong because in { c d
e_sharp f } 'e_sharp' is a valid WORD, but not a valid note-name.

http://codereview.appspot.com/6493072/diff/14/lily/lexer.ll#newcode859
lily/lexer.ll:859: push_note_state (nn);
On 2012/09/02 11:52:46, dak wrote:
> What is this supposed to do?  INITIAL state is not supposed to have
> pitchnames defined.

To fully-bake this implementation, I would want INITIAL to have pitchnames
defined, so I can use INITIAL at top level to recognize the pitch in  \relative
do' {}.

Then entry into note-mode would not need to push onto pitchname_tab_stack_

http://codereview.appspot.com/6493072/diff/14/lily/parser.yy
File lily/parser.yy (right):

http://codereview.appspot.com/6493072/diff/14/lily/parser.yy#newcode1190
lily/parser.yy:1190: '{' { parser->lexer_->push_maybe_note_state (); } music_list '}'
On 2012/09/02 11:52:46, dak wrote:
> Pushing a separate "maybe_note" state for _every_ braced music list?
> Seriously?

For every nested level of braces.
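
A rough Python sketch of what "one state per nested level of braces" amounts to is given here. The thread does not show what `push_maybe_note_state` actually does, so the sketch simply assumes one push per '{' and one pop per '}'; the function name `state_trace` and the state names are hypothetical.

```python
def state_trace(source, top_state="INITIAL", inner_state="notes"):
    """Record the lexer-state stack depth after every brace in `source`."""
    stack = [top_state]
    trace = []
    for ch in source:
        if ch == "{":
            # Assumed behaviour: each opening brace pushes one state.
            stack.append(inner_state)
        elif ch == "}":
            if len(stack) == 1:
                raise ValueError("unbalanced '}'")
            # The matching closing brace pops it again.
            stack.pop()
        else:
            continue
        trace.append((ch, len(stack), stack[-1]))
    if len(stack) != 1:
        raise ValueError("unbalanced '{'")
    return trace
```

For an input such as `{ c { d e } f }` the stack grows to depth 3 at the inner brace and returns to the top-level state at the end, which is the per-level cost dak is questioning.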

Keith

On 2012/09/02 11:52:46, dak wrote: > It looks like some _severe_ doctoring around with regard ...

11 years, 8 months ago (2012-09-02 18:02:35 UTC) #3

dak

All in all, this creates so many loose ends and problems of the kind I ...

11 years, 8 months ago (2012-09-02 18:22:14 UTC) #4

Keith

The version "digits, but not at the end" lets us read vn2_meas345ff = \relative c' ...

11 years, 8 months ago (2012-09-03 18:08:23 UTC) #5

dak

On 2012/09/03 18:08:23, Keith wrote: > The version "digits, but not at the end" lets ...

11 years, 8 months ago (2012-09-03 20:07:07 UTC) #6

bernard_marcade.biz

On Mon, Sep 03, 2012 at 08:07:07PM +0000, dak@gnu.org wrote: > > flex documentation is ...

11 years, 8 months ago (2012-09-03 20:35:35 UTC) #7

Keith

On 2012/09/03 20:07:07, dak wrote: > flex documentation is pretty clear about backing up being ...

11 years, 8 months ago (2012-09-03 23:00:25 UTC) #8

Keith

While we are thinking about this, I suggest we remove (later) the rule forbidding backing-up ...

11 years, 8 months ago (2012-09-05 06:59:16 UTC) #9

dak

On 2012/09/05 06:59:16, Keith wrote: > While we are thinking about this, I suggest we ...

11 years, 8 months ago (2012-09-05 07:50:27 UTC) #10

Keith

On Wed, 05 Sep 2012 00:50:27 -0700, <dak@gnu.org> wrote: > On 2012/09/05 06:59:16, Keith wrote: ...

11 years, 8 months ago (2012-09-05 09:26:12 UTC) #11

dak

On 2012/09/05 09:26:12, Keith wrote: > Agreed, > but I'll still pout a couple more ...

11 years, 8 months ago (2012-09-05 09:51:17 UTC) #12

janek

On Wed, Sep 5, 2012 at 11:51 AM, <dak@gnu.org> wrote: > If it had been ...

11 years, 8 months ago (2012-09-05 10:52:02 UTC) #13

lemzwerg

LGTM. Nice idea. I'm not sure whether this fits into the large picture w.r.t. syntax ...

11 years, 6 months ago (2012-10-29 06:20:17 UTC) #14

dak

11 years, 6 months ago (2012-10-29 10:05:30 UTC) #15

On 2012/10/29 06:20:17, lemzwerg wrote:
> LGTM.  Nice idea.  I'm not sure whether this fits into the large picture
> w.r.t. syntax normalization as envisioned by David, but at least for me it
> looks reasonable.

Well, I replied on the Google code review as well.

In a manner, this is the kind of issue that would make it convenient (or at
least time- and worry-saving) for me to have the sort of Linus-like dictatorship
that the Linux kernel has.

With most syntax proposals, "I won't do this myself" is likely enough to let the
efforts fizzle out eventually.  This is basically the "traditional" approach
applied by Han-Wen and Jan, but I don't consider this really an upright way of
dealing with things.  I have to concede that this approach made them cooperate
well (or rather not cooperate, with nobody being particularly unhappy or worried
about that) with Graham, something which I failed to do satisfactorily.

Now Keith is not as easy to shrug off as that.  He delivers code along with his
proposals.

With regard to syntax changes and/or additions, of course I am the most guilty
party for opening cans of worms.  Unifying word syntax across modes was perhaps
the most disruptive in this area.  The result was that "a word is a sequence of
letters (including any non-ASCII character by definition), possibly interspersed
with single - and _ characters".  This is a unification of the word concept that
was different in INITIAL and music mode.
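
As a hedged illustration, the unified word rule quoted above can be written as a Python regular expression. This is a reading of the prose definition, not the flex rule in lily/lexer.ll; the names `LETTER`, `WORD` and `is_word` are made up for the sketch, and "letter" is assumed to mean an ASCII letter or any non-ASCII character.

```python
import re

# Assumption: a "letter" is an ASCII letter or any non-ASCII character.
LETTER = r"[A-Za-z\u0080-\U0010FFFF]"

# Runs of letters, with a single '-' or '_' allowed only between runs.
WORD = re.compile(rf"{LETTER}+(?:[-_]{LETTER}+)*")

def is_word(text):
    """Return True if the whole of `text` is a word under this reading."""
    return WORD.fullmatch(text) is not None
```

Under this reading `new-variable-name` and `e_sharp` are words, while `violin1`, `a--b` and `-a` are not.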

Now Keith's proposal boils down to "a word is a sequence of letters (including
any non-ASCII character by definition) possibly interspersed with single -, .
and _ characters, where behind a . there may also come a digit sequence possibly
followed by further word constituents optionally starting with -, . and _
again".
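
The extended rule in the preceding paragraph can be sketched the same way. Again this is only one reading of the prose summary, not the actual patch set 4 lexer rule; `DOTTED_WORD` and `is_dotted_word` are hypothetical names, and "letter" is assumed to mean an ASCII letter or any non-ASCII character.

```python
import re

# Assumption: a "letter" is an ASCII letter or any non-ASCII character.
LETTER = r"[A-Za-z\u0080-\U0010FFFF]"

# Letters, then any number of:
#   - a single '-', '_' or '.' followed by letters, or
#   - a '.' followed by digits, optionally followed directly by letters.
DOTTED_WORD = re.compile(
    rf"{LETTER}+(?:[-_.]{LETTER}+|\.[0-9]+{LETTER}*)*"
)

def is_dotted_word(text):
    """Return True if the whole of `text` is a word under this reading."""
    return DOTTED_WORD.fullmatch(text) is not None
```

In this sketch `violin.1` is a single word, but `violin1` and `violin_1` are not, and neither is the spaced form `violin . 2`, since the digits belong to the identifier text and are not joined by an operator.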

So far, so bad.  Now hooking onto the recent Context.Grob changes probably
inspiring this proposal, we have the situation that Context.Grob is equivalent
to Context . Grob or to Context . $(string-append "Gr" "ob"), that is, . is
acting as an operator here joining several expressions possibly created in
different ways.

Keith's proposal would not imply that violin . $(+ 1 1) would be the same as
violin.2 and not even violin . 2 would work here.

The superficial similarity with the dotted syntax breaks down as soon as you try
putting it to the test.

So in the category "does this change lead to a greater consistency of LilyPond
syntax in itself", which is more or less the metric I use for justifying
invasive changes to myself, this change introduces complex rules and analogies
to existing constructs that work out only at a particular superficial level.  It
makes LilyPond harder to understand, not simpler.

I can wave around my long-term plans which would allow for just writing \violin1
by allowing arrays of violins (I have something in a branch right now, but
without further syntax changes I am working on it is not really fitting
seamlessly into LilyPond).  But long-term plans are not really a suitable excuse
for blocking other developers indefinitely.

Personally, I consider digits in identifiers not worth screwing LilyPond over.
My \"violin1" patch is something I don't consider really a fabulous idea: its
main incentive is quieting the demands for more invasive changes whose
consequences for documentation and future changes will be considerably harder
to sort out.

It seems that my goal of quieting calls for more invasive changes has actually
failed.

I expect this to cause a lot of trouble.  That is not an absolute
counterargument: I expect some of my changes to cause quite a bit of trouble as
well.  What is, in my book, making a difference in the evaluation is that this
trouble, which undoubtedly will tend to end up primarily at my own doorstep,
buys us a purely cosmetic change without any functional difference.  violin.1
will not be able to have the 1 calculated rather than spelled out.  It really
just is part of the identifier without numerical meaning.  a.b is equivalent to
a . b, but a.1 would not be equivalent to a . 1.

So while I can't class this as "impossible", I do consider it as "too expensive"
in terms of explaining it to the user, and in terms of dealing with followup
consequences that will be mostly my responsibility.

Now I am perfectly well aware that if "David feels this is a bad idea" is
supposed to hold any reasonable amount of power, we are essentially back at a
dictatorial situation where I maintain power by virtue of being able to make the
most ominous threats (basically, the model of modern representative democracy). 
Which is, in a way, again cheating our self-chosen systems in a manner I don't
appreciate.

So I hope that Keith does not view my repeated objections to proposals in this
issue as a disregard of his work on them.  I hope, pompous as this may sound,
that he can view this as a sort of learning situation where his detailed
proposals result in more detailed feedback about how the choices arising from
the revolutionary conservatism I tend to exhibit in program design come about.

This proposal is clever, and the basic opportunity is well-spotted.  I just
think it is a bit too clever when we are taking into account the restrictions it
still has and the consequences to related areas and the documentation.

\"xxx1" has pretty much the same restrictions and underlying ugliness (probably
a bit more) and unnecessity.  It is just much cheaper and well-confined.  It
does not have the "this will be a lot of trouble later on" ring to it.

Keith

11 years, 6 months ago (2012-10-30 04:13:30 UTC) #16

On 2012/10/29 10:05:30, dak wrote:

> Keith's proposal would not imply that violin . $(+ 1 1) would 
> be the same as violin.2 and not even violin . 2 would work here.
> 

I didn't think we wanted such things.
Nor do we want 
  \paper { short-indent = 3\cm }
to subtract 'indent' from 'short'.  We are simply accommodating other language
conventions, Scheme or human, in acceptable names.

We can choose a different character to introduce digits
 \violin+1  \violin01  \violin,1  \violin;1  \violin:1
as in patch set 3, for example.
