Friday, October 2, 2015

"Data in a Major Key": MIT Spectrum, FSU

MIT Spectrum has an article by Kathryn M. O'Neill on my work, music21, and computational musicology:
“IF I WANT TO KNOW how the guitar and saxophone became the important instruments throughout classical repertory or how chord progressions have changed, those are questions musicology has been unable to approach,” says Associate Professor of Music Michael Cuthbert. Spotting trends and patterns in a large corpus of music is nearly impossible using traditional methods of study, because it requires the slow process of examining pieces one by one. What his field needed, Cuthbert determined, was a way to “listen faster.”

In other news, Clifton Callender at Florida State University is currently teaching a doctoral seminar on music theory techniques using music21.  His course description is at

Monday, September 28, 2015

Music21 v. 2. (non-beta) released!

The long-awaited (at least by me) version 2 of music21 is released!  This is the first version of the v.2 release to be out of beta and stable enough for general use by everyone.

Upgrade with:

    pip install --upgrade music21

Or download from GitHub.

The first non-beta release of music21 since v. 1.9.3 (June 2014) brings a ton of new features and lots of new speed.  But being a major release change number, it also has some changes that every programmer using the system needs to be aware of.  The release notes on GitHub give all the details, but here are the highlights since 1.9:

Changed and Added features

* Duration and Offset now use Fractions when necessary for exact representation of tuplets. Many, many errors from rounding are gone.  For now, you can use Duration.quarterLengthFloat and offsetFloat to get the old behavior, but float(Duration.quarterLength) and float(offset) are better.
* Converters support easy-to-install custom subconverters. MEI is now supported (thanks to McGill University)
* Python 2.6 is not supported.  Python 3.4 is highly recommended; 2.7, 3.3, and 3.5 also work.
* Loading cached streams is extremely fast. All streams are automatically cached when loaded from disk.
* Sorting is much more consistent and faster
* MusicXML parsing and showing have been rewritten to use cElementTree and many new features.
* Stream's internal mechanisms have been hugely rearranged.  Now offsets are stored inside Streams instead of inside Notes, etc., making lots of things faster and more reliable.
* Streams support filters on iteration using the `.iter` property and the `recurse()` method.  These are big changes for speed and reliability.
* Namedtuples replace anonymous tuples in many places
* Music21 is available under the BSD license.
* Musedata files are no longer available in the corpus. However, new files in MusicXML format have replaced several of them.
* Complete rewrite of TinyNotation making it much easier to subclass for your needs.
* If you have MuseScore 2, try'musicxml.png') to get a beautifully rendered musicxml file. Or use .pdf to get something ready to print.  Thanks Nicholas, Thomas, and Walter!
* Builds are automatically tested for errors and documentation coverage.
* Experimental modules moved to the `alpha` sub package.  `demos` reorganized.
* Lots of documentation changes!
* Obscure and almost never used (or actually never used) methods and attributes have been removed.
* Did I mention how much better the documentation is getting?
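
The switch to Fractions for Duration and Offset mentioned at the top of this list can be illustrated with the standard library alone. This is a minimal sketch and not music21 code; the rounded value 0.3333 is an assumed example of what a lossy float import might store for a triplet eighth.

```python
from fractions import Fraction

# A triplet eighth note lasts one third of a quarter note.
# Assumed example: a float rounded to four places, as a lossy import might store it.
rounded_triplet = round(1 / 3, 4)
float_total = rounded_triplet * 3      # close to, but not exactly, 1.0

# The same duration stored as an exact fraction.
exact_triplet = Fraction(1, 3)
exact_total = exact_triplet * 3        # exactly 1

print(float_total == 1.0)   # False
print(exact_total == 1)     # True

# Converting back to a float only at the point where one is needed.
print(float(exact_triplet))
```

Keeping the exact value and calling float() only at the edges is the same idea as preferring float(Duration.quarterLength) over a stored float.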

In case anyone is keeping track, since v.1.0 (June 2012), here are the:

Biggest changes between 1.0 and 1.9

* Store complete Streams via FreezeThaw
* Output to Vexflow and `music21j`
* Converters have been moved into packages.
* It takes 1/3 the time to do most operations, and 1/4 the time to start up.
* Capella supported.  ABC imports almost everything. Humdrum supports multiple voices. Chords have a better root() algorithm
* Many, many new corpus pieces.
* Layout support.
* Python 3 supported, and now recommended.
* Timespans make .getContextByClass at least an order of magnitude faster, letting music21 handle huge scores.
* Derivations reduce the number of Streams to keep track of.

Oh, and I did more than patch bugs in the last week:

Release notes since 2.0.11

* Streams use .iter and .recurse() in TONS of functions, making many a lot faster, a few a bit slower, but all cleaner to debug and safer.
* Deprecated items now return a deprecation warning.
* Duration objects now have a `.client` which can inform the `Note` of changes to it.
* `.classes` searches are way faster. Returns tuple.
* `deepcopy` is about 30% faster.
* `common` is split into a directory of related functions.  Now worth looking through.
* all corpus files, including small .abc files with non-standard additions, now parse.  A complete should be possible without any try: statements.
* several bugs in musicxml processing (mainly related to the handling of expressions, noteheads, etc., on chords) have been fixed.  Also Finale's `` tag is supported.
* code is much more "lint-free" catching many subtle bugs.
* audioSearch is cleaned up, with beta-type code moved to demos.
* Documentation much improved including three new User's Guide sections, and (thanks to bagratte) fixes for UTF-8 errors.
* `` replaces `` for better non-Western script handling.
* .egg files are no longer distributed.  I'll work on getting .whl (wheel) files soon, but for now use .tar.gz.  PyPi no longer supports .egg, so there's no reason for them.
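
For readers wondering what "deprecated items now return a deprecation warning" looks like in practice, here is a generic standard-library sketch. The decorator and the function names `old_name` and `new_name` are hypothetical and are not music21's own mechanism.

```python
import functools
import warnings


def deprecated(replacement):
    """Hypothetical decorator: warn on every call, then run the old function."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            warnings.warn(
                "{} is deprecated; use {} instead".format(func.__name__, replacement),
                DeprecationWarning,
                stacklevel=2,
            )
            return func(*args, **kwargs)
        return wrapper
    return decorator


@deprecated("new_name")
def old_name(x):
    # Stand-in for a function scheduled for removal.
    return x * 2


# DeprecationWarning is hidden by default in many contexts, so capture it explicitly.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    result = old_name(21)

print(result)
print(caught[0].category.__name__, caught[0].message)
```

Running your own scripts with `python -W always` is one way to surface such warnings before the deprecated calls disappear.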

Incompatible changes

* `.fullyQualifiedClasses` is GONE. No one used it.  Instead a new `.classSet` replaces it for rapid class searching.
* sites.Sites and sites.SiteRef are no longer imported into base by default.
* `documentation` modules reorganized, with better examples.
* `stream.core` moves several core modules out of the `stream` module.
* `Volume.parent` renamed `Volume.client` to match `Derivation` and `Duration`
* `.components` on `Duration` now returns a tuple.
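
To show why a set helps with rapid class searching, here is one way such a lookup could be built from Python's method resolution order. The toy classes and the helpers `class_names` and `class_set` are hypothetical; this is a sketch of the idea, not the music21 implementation.

```python
class ToyBase:
    pass


class ToyGeneralNote(ToyBase):
    pass


class ToyNote(ToyGeneralNote):
    pass


def class_names(obj):
    """Ordered tuple of class names, most specific first."""
    return tuple(cls.__name__ for cls in type(obj).__mro__)


def class_set(obj):
    """Frozenset of class names and class objects for constant-time membership tests."""
    mro = type(obj).__mro__
    return frozenset(cls.__name__ for cls in mro) | frozenset(mro)


n = ToyNote()
print(class_names(n))
print("ToyGeneralNote" in class_set(n))
print(ToyNote in class_set(n))
```

A tuple keeps the inheritance order, while the frozenset answers "is this a kind of X?" without scanning.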

What's Next?

Today also announces the first commit of music21 3.0 -- for the first time, I'm going to try to do something daring: keep bug fixes and some backwards compatible changes in the 2.1 (2.2, etc.) branch, but go forward with bigger changes in a 3.0-alpha branch.  Some things that you might expect to happen:

* All deprecated functions will be gone in 3.0; like immediately; like I'm deleting them as I type.

* Lots of things that currently return a Stream will instead return iterators over Streams.  These include .getElementsByClass() and .getElementsByOffset() -- the fact that so many streams get created is one of the biggest headaches and reasons why the system gets slow.  You can prepare for the change by examining your usage of these functions and asking yourself, "Am I actually using this as a Stream? Or just as a bunch of objects to iterate over in a for loop or to count using len()?"  If the latter, you're fine.  If the former, go ahead and add .stream() after it, for instance filteredStream = s.getElementsByClass("TimeSignature").stream().  The last `.stream()` call does NOTHING right now, but it will ensure that your code works exactly the same after the change happens.  If you want to use the new features (even in 2.1) add .iter between `s` and `.getElementsByClass()` (but leave off the `.stream()`).  You'll find that life will be going a lot lot lot faster.

* I'm going to make a second attempt to use TimeSpans as a general storage engine for Streams.  These are the super fast representations of Streams that Josiah Oberholtzer made, that speed up working with large streams by 10-100x. But for very small streams (such as one measure of a Chorale), they are much slower than the current Streams. Now that all the core mechanisms are factored out of Stream into StreamCore, I can play much more easily with switching in and out the backend functions. Using the lessons of Python's TimSort, I'll probably have the TimeSpan core kick in immediately when there are more than 64 elements in a Stream; it should be seamless except for a tiny delay when the 65th element is added (like shifting gears in a car).

* I may make Python 3.4 a requirement.  We'll see... I'm sick of coding for Python 2.  Python 3 is much more fun from the coder's perspective.
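
The "shifting gears" idea in the TimeSpans item above can be sketched with a toy container that changes its storage once it grows past a threshold. Everything here is hypothetical: `HybridStore` is an invented name, and a sorted list with `bisect` stands in for the real TimeSpan structure, which this post does not describe in detail.

```python
import bisect

THRESHOLD = 64  # from the text: switch when the 65th element arrives


class HybridStore:
    """Hypothetical container: plain list when small, sorted index when large."""

    def __init__(self):
        self._items = []       # (offset, name) pairs in insertion order
        self._sorted = None    # built once the threshold is crossed

    def add(self, offset, name):
        self._items.append((offset, name))
        if self._sorted is not None:
            bisect.insort(self._sorted, (offset, name))
        elif len(self._items) > THRESHOLD:
            # One-time cost: the "shifting gears" moment.
            self._sorted = sorted(self._items)

    @property
    def backend(self):
        return "list" if self._sorted is None else "sorted"

    def at_or_after(self, offset):
        """Return all items whose offset is >= the given offset, in offset order."""
        if self._sorted is None:
            return sorted(item for item in self._items if item[0] >= offset)
        start = bisect.bisect_left(self._sorted, (offset,))
        return self._sorted[start:]


store = HybridStore()
for i in range(64):
    store.add(float(i), "n{}".format(i))
print(store.backend)     # list
store.add(64.0, "n64")
print(store.backend)     # sorted
```

Callers only ever see `add` and `at_or_after`, so the backend can change underneath them without any code changes on their side.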

Thanks everyone for great support! -- Myke

Monday, September 21, 2015

Music21 v. 2.0.11 released

Ten days since the last release, so time for a new one. Again, speed, stability, and new features. The biggest change is the entirely new MusicXML output system to match the entirely new input system introduced last release. 
The second biggest is in the (re)introduction of StreamIterators and RecursiveIterator. I'll need to get some demos up of this soon, but this will be a game changer for some tasks.
Update or install with one (or both) of these commands:

    pip install --upgrade music21
    pip3 install --upgrade music21

Bigger changes
  1. MusicXML now uses the faster, more reliable ElementTree output generator. Please report any bugs on import or export, especially if they are regressions from format='oldmusicxml'; oldmusicxml will disappear soon.
  2. Better docs (see below), especially for the long under documented recurse function. Everything that was in overview is now in the User's Guide.
  3. Streams now support filters on iteration -- if you have been using: for e in s.getElementsByClass('X'), try: for e in s.iter.getElementsByClass('X') for a major speedup, especially if you just want the first one or something of that sort. Recurse() supports the same, so for e in s.recurse().notes.getElementsByGroup('tuba') will be WAY faster than before. You might not notice the difference on your own work, but internally things are getting a lot faster. (obscure non-filter routines will be deprecated and disappear soon).
  4. Corpus docs/indexes, etc. are updated with more recent corpus changes (nothing new, but easier to find).
  5. Use of deprecated functions now generates a warning. This should help people plan for migration in case you're not reading the documentation religiously.
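
Why filtering during iteration helps "if you just want the first one" (item 3 above) can be shown without music21 at all. In this sketch `Element`, `eager_by_kind`, `lazy_by_kind`, and `counting` are hypothetical stand-ins; the point is only the difference between building a whole filtered collection and yielding matches on demand.

```python
class Element:
    """Hypothetical stand-in for an object stored in a stream."""

    def __init__(self, kind, offset):
        self.kind = kind
        self.offset = offset


def eager_by_kind(elements, kind):
    """Builds a complete new list before the caller sees anything."""
    return [e for e in elements if e.kind == kind]


def lazy_by_kind(elements, kind):
    """Yields matches one at a time and stops when the caller stops asking."""
    return (e for e in elements if e.kind == kind)


def counting(iterable, counter):
    """Pass items through while counting how many were examined."""
    for item in iterable:
        counter[0] += 1
        yield item


elements = [Element("Note", i) for i in range(10000)]
elements.insert(1, Element("TimeSignature", 0))

seen_eager = [0]
first_eager = eager_by_kind(counting(elements, seen_eager), "TimeSignature")[0]

seen_lazy = [0]
first_lazy = next(lazy_by_kind(counting(elements, seen_lazy), "TimeSignature"))

print(seen_eager[0], seen_lazy[0])   # 10001 2
```

Both calls find the same element, but the lazy version stops after examining two items instead of all of them.
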
Smaller changes
  1. Documentation is improved and updated working with Jupyter/IPython 4 (note: a bug in nbconvert + pandoc requires pandoc v. 1.33 or older to make; they're working on a patch). Docs build in parallel, so it's very fast -- you'll see updates more often.
  2. Documentation is now separated into "source/" and "autogenerated/" folders -- everything in source is user editable. Nothing in autogenerated is.
  3. A number of obscure, long deprecated functions are gone, the biggest being n.removeLocationBySite(); use n.sites.remove() instead.
  4. normalization in features has been fixed (Thanks Frank Zalkow)
  5. Parsing of cappella MusicXML files has been improved.
  6. Improved parsing of RomanText files; bugs in several encodings of rntxt and abc files have been fixed.
  7. common.nearestCommonFraction has been renamed addFloatPrecision to better reflect what it does. This has always confused me.

Friday, September 11, 2015

Music21 v2.0.10 beta -- MusicXML Reading

This post announces the v.2.0.10 beta release of music21, which is moving quickly to the official v.2 release, v.2.1.  Some of the changes have already been announced on the music21list Google Groups mailing list.

Upgrade by downloading from or by running "pip install --upgrade music21"

The major changes include:

  • New parsing engine for MusicXML (see below)
  • DurationTuples replace DurationUnits
  • Percussion clefs and No Clefs now are supported properly in musicxml output
  • Improvements to the RomanText and clercqTemperley formats (thanks DT!)
  • Some obscure modules removed from the main namespace:
    1. intervalNetwork becomes scale.intervalNetwork and BoundIntervalNetwork becomes simply IntervalNetwork.
    2. scala becomes scale.scala
    3. chord becomes a package and chordTables becomes chord.tables 
  • In the next version, expect languageExcerpts to become text.languageDetection and the "xmlnode" module to disappear.
  • Environment and CapellaXML, which depended on XMLNode now don't.  CapellaXML processing is 10x faster.
  • jsonpickling is upgraded and safer.
  • Building documentation now works on IPython 4/Jupyter 4.0
  • MusicXML output with Unicode now works on Py3 (thanks Sarig!)
  • Spanners on Rests now export properly in MusicXML
  • VexFlow only supports the music21j based output now. More bug fixes there to come (or will be moved to alpha support)
  • Everything overall is about 30% faster than a month ago.

The biggest change in this version is how MusicXML is processed.  When Christopher Ariza joined the music21 team in 2008, music21 had a tiny limitation: it didn't work with MusicXML, at all. Whoops! It was just too big a task to tackle for me when I was still figuring out how Streams, Sites, Durations, etc. would work. Thankfully Chris took it on and extremely quickly produced a great parser for MusicXML.  The problem back then was that few people were on the latest, greatest version of Python 2.5, and music21 aimed to support at least back to Python 2.1, and only the newest Python 2.5 had the brand new "ElementTree" Python processing module (and there were still substantial bugs in that module before Python 2.6).  We were determined not to make MusicXML parsing require an external library such as "lxml", so that left two choices, xml.minidom and xml.sax.

Anyone who knows anything about the structure of MusicXML and the differences in philosophy between DOM and SAX will know that DOM is the logical choice for MusicXML parsing -- it allows nodes to look at their neighbors, parents, children, and make logical decisions (am I a note, rest, or chord?) based on the context.  SAX on the other hand is built on calling functions whenever a particular tag start is encountered, whenever data is encountered, and whenever a stop tag is encountered. Great for certain types of text formatting, insanely difficult for a format like MusicXML (or MEI or just about any music format besides perhaps MIDI).  So, if memory serves, Chris wrote a quick DOM processor for MusicXML and it was getting notes, durations, measures, beautifully.
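
The difference in philosophy is easier to see side by side. The sketch below classifies the notes in a tiny, simplified MusicXML-like fragment twice: once with a tree API (ElementTree here, standing in for any DOM-style parser) and once with SAX events. The fragment, the function names, and the three-way "note / chord / rest" labels are illustrative assumptions, not the parser described in this post.

```python
import xml.etree.ElementTree as ET
import xml.sax

SNIPPET = b"""<measure>
  <note><pitch><step>C</step><octave>4</octave></pitch><duration>4</duration></note>
  <note><chord/><pitch><step>E</step><octave>4</octave></pitch><duration>4</duration></note>
  <note><rest/><duration>4</duration></note>
</measure>"""


def classify_with_tree(xml_bytes):
    """Tree style: each <note> can inspect its own children directly."""
    kinds = []
    for note in ET.fromstring(xml_bytes).iter("note"):
        if note.find("rest") is not None:
            kinds.append("rest")
        elif note.find("chord") is not None:
            kinds.append("chord")
        else:
            kinds.append("note")
    return kinds


class NoteKindHandler(xml.sax.ContentHandler):
    """Event style: state must be tracked by hand between start and end tags."""

    def __init__(self):
        super().__init__()
        self.kinds = []
        self._current = None

    def startElement(self, name, attrs):
        if name == "note":
            self._current = "note"
        elif name in ("rest", "chord") and self._current is not None:
            self._current = name

    def endElement(self, name):
        if name == "note":
            self.kinds.append(self._current)
            self._current = None


def classify_with_sax(xml_bytes):
    handler = NoteKindHandler()
    xml.sax.parseString(xml_bytes, handler)
    return handler.kinds


print(classify_with_tree(SNIPPET))
print(classify_with_sax(SNIPPET))
```

Even in this toy case the SAX handler needs its own bookkeeping (`_current`) to remember which note it is inside, which hints at how quickly that grows for a real score.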

But Chris Ariza is also probably the best programmer I've ever met and before going further he profiled the system and extrapolated what it would be like to work with a large corpus of MusicXML files using it.  Slow as slime.  The minidom was implemented entirely in Python, not highly optimized, and was not going to make anyone want to use MusicXML in the toolkit.

So, he basically did the impossible: implemented a blazingly fast SAX processor for MusicXML that built a close-to-the-original representation of the file (musicxml.mxObjects) and then processed that in a much more friendly format.  Bam! Speed went up by an order of magnitude, and everything that music21 could do with MusicXML was born.  In the dozens of releases since he moved on from the project, I've barely had to touch the internals at all even as the rest of the system has expanded and changed dramatically. And there was a system for caching the mxObjects representation for a speedup in the next parse.

Fast forward 7 years.  Python has changed.  Version 2.7 is now the minimum requirement (it's over five years old already; we just found a check for Python > 2.2 somewhere in the system and removed it!). Versions 3.3 and 3.4 are supported (3.5 should be out this week and of course will be supported).  And everyone has access to xml.etree.ElementTree now. And the final representation of all parsed formats is now cached, so there is no need for the mxObjects cache.   So in the interest of simplifying parsing (and getting a 40% speedup over SAX + mxObjects), it made sense to rewrite the MusicXML parsing engine.

The new version is called musicxml.xmlToM21.  There are a few miscellaneous files in a new musicxml.xmlObjects file, but basically all the parsing takes place in the xmlToM21 file.  Every tag in musicxml is now written directly into the file to make it easier to see exactly which tag is causing any particular problem. (Line number properties may be possible to add soon).  Because the format of the parser is now much closer to the format of the MusicXML document, a TODO: has been added for every missing tag, or attribute.  Expect music21 to support every tag and attribute in MusicXML 3.0 sometime soon.  If you've ever wanted to hack additional support into Music21's MusicXML parsing but it seemed too daunting, give another look at the code now.

This is a major change on the most used format for music21. Thankfully, Ariza wrote so many tests into the system that I am relatively confident that everything now works exactly like before.  The exceptions are: non-printed notes are no longer skipped (this was to prevent the next bug), notes with incorrect divisions are now corrected rather than skipped, and spanners preceding rests are now attached to the rest rather than the next adjacent note.  (My intention was to be 100% compatible with before, but it would've been very hard to replicate this incorrect behavior).  The one negative side-effect you will see is that parsing some of the Beethoven files is now slower (rather than 40% faster) because some of those files used a large number of incorrectly notated, non-printing notes to represent playback of trills.  For certain files (such as the Große Fuge) the number of notes in the score will almost double with the new system.

Because this change is major, for now you can still use the old parsing system via converter.parse('filename.xml', format='oldmusicxml').  I suggest also adding "forceSource=True" to make sure that you are reading the file from disk and not from Cache.

I'm extremely excited by this change -- we will get the writing of music21 files to use the new system by the next release (a much easier task).

As always, music21 has been supported by the Seaver Institute, the NEH Digging into Data grant, and MIT Music and Theater Arts/SHASS.

Durations and DurationTuples

If you don’t do a lot with Tuplets (Triplets, etc.) and have never heard of a DurationUnit, this is a post to skim. :-)

Music21 has the ability to consider two different incompatible ideas of a note as a note.Note (or note.Rest, chord.Chord, etc.): a note on a page and a note as heard.

(1) When we write a note on a page, it has certain properties: stems, flags, beams, dots, ties, tuplets, etc. Consider a half note tied to an eighth note.  On the page these are definitely two notes — we read them as two notes and then sound them together because of the tie.

(2) Now consider a single sound from a trumpet, at quarter note = 60, which lasts 2.5 seconds.  This is a note.  One note.  But try to notate it.  You’ll find that it takes two notes on the page to notate it: a half note and an eighth note, tied together.  But when we hear this sound, it sounds like one note.

How can we represent the second type of note in music21 even though it can’t be notated as a single note (well, not normally; more on that below)? Simple, we create a single Note object, which has a single Duration object.  That Duration object, however, has two elements in its “.components” list — a component corresponding to a half note, and a component corresponding to an eighth note.  The note’s overall duration has a type of “complex”.  When sending this Note out to MusicXML, Lilypond, etc., we split it into two notes (you guessed it, a half note and an eighth note).  We then look at something called “linkage” to see that each of these notes should be connected by a Tie.  (Rests have no linkage, for instance).  When we send it out to MIDI, on the other hand, we can leave it alone as one Note, since MIDI doesn’t support ties, but does support arbitrary lengths of notes.
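
As a rough illustration of how one sounding length breaks into notatable components, here is a greedy split over undotted note types. It is a simplified sketch under stated assumptions: the table `TYPES` and the function `split_quarter_length` are hypothetical, dots and tuplets are ignored, and music21's own splitting logic is more capable than this.

```python
from fractions import Fraction

# Undotted note types and their quarter lengths, longest first.
TYPES = [
    ("whole", Fraction(4)),
    ("half", Fraction(2)),
    ("quarter", Fraction(1)),
    ("eighth", Fraction(1, 2)),
    ("16th", Fraction(1, 4)),
]


def split_quarter_length(ql):
    """Greedily split a quarter length into undotted components."""
    remaining = Fraction(ql)
    components = []
    for name, length in TYPES:
        while remaining >= length:
            components.append((name, length))
            remaining -= length
    if remaining:
        raise ValueError("cannot notate remainder {}".format(remaining))
    return components


# The trumpet note from the example: 2.5 quarter lengths.
components = split_quarter_length(Fraction(5, 2))
print([name for name, _ in components])   # ['half', 'eighth']
```

For notation output each component becomes its own written note joined by ties; for MIDI the total length is all that matters.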

So, this has been the music21 model since alpha 1, and in general it remains the model.

What has changed in the newest GitHub repository and will change in the next 2.X release is what these “.components” are.  Up until now they’ve been an object called DurationUnit — an amazingly flexible object created by Chris Ariza that can represent everything that a “simple” duration does; DurationUnits can have tuplets, dot-groups (an obscure medieval term), and just about everything else you can think of.  They’re extremely cool, and I’m going to miss them.

In music21 2.X, the components of a complex duration are called DurationTuples.  They are much simpler objects that only store three pieces of information: the type (‘whole’, ‘half’, ‘16th’, etc., plus ‘zero’), the number of dots, and the quarter length of that component.  They don’t have tuplets, dot groups, etc.  And they’re called Tuples because they derive from namedtuple which derives from the Python tuple object — in other words they are immutable.  Once a DurationTuple is set, it can’t be changed.  To change a note’s duration from whole to half, the DurationTuple needs to be deleted from .components and a new one created and inserted.  So they do everything a DurationUnit does and much less.
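
What immutability means here can be seen with a plain namedtuple. `DurationTupleSketch` is a hypothetical stand-in carrying the three fields described above; it is not the music21 class itself.

```python
from collections import namedtuple

# Hypothetical stand-in with the three fields the text describes.
DurationTupleSketch = namedtuple(
    "DurationTupleSketch", ["type", "dots", "quarterLength"]
)

whole = DurationTupleSketch("whole", 0, 4.0)

try:
    whole.type = "half"      # fields of a namedtuple cannot be reassigned
    mutated = True
except AttributeError:
    mutated = False

# Changing a duration means building a new tuple and swapping it in.
components = (whole,)
half = whole._replace(type="half", quarterLength=2.0)
components = (half,)

print(mutated)          # False
print(components[0])
```

Because the old tuple is never altered, anything else still holding a reference to `whole` keeps seeing a whole note.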

So, why the change?  Amazing flexibility and power, such as DurationUnits offer, comes at a price: speed.  And complexity.  The new DurationTuple makes creating the most common type of Duration with a single component much faster.  The amount of time to do: “d = duration.Duration(1.0); d.type” has been cut by over half.  This makes the creation of Notes about 20% faster than before (well, after the first check of durations), which is a pretty substantial improvement.  And as Dmitri and others have noticed, there are a lot of ways to change the duration of a Note that can affect other things, such as Streams.  This change reduces the complexity by making it so that only the Duration object itself can change its duration.  Changes to underlying .components are impossible to make since DurationTuples are immutable.

The only practical effect that most users are likely to see is in the use of Tuplets.  In the past tuplets lived on DurationUnits.  This meant for instance that a Duration could represent a single duration of “half-note-tied-to-eighth-note-triplet” (QL = 2.333333333).  Now, all the components of a Duration need to have the same tuplet or nested tuplets.  So this duration can be represented as a (dotted-half-note + eighth note) triplet.  Or it can be represented as (do the math, but only when you have time) a single whole note in a 12:7 tuplet.  Because the latter is easier for the computer to determine, that’s what is done right now, but that could change.
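
If you don't have time to do the math, exact fractions can do it. This short check uses only the standard library and the note values named above; the variable names are just for illustration.

```python
from fractions import Fraction

half = Fraction(2)
dotted_half = Fraction(3)
eighth = Fraction(1, 2)
whole = Fraction(4)

triplet = Fraction(2, 3)         # 3 notes in the time of 2
twelve_seven = Fraction(7, 12)   # 12 notes in the time of 7

target = half + eighth * triplet                     # half tied to a triplet eighth
as_triplet_group = (dotted_half + eighth) * triplet  # both components under one triplet
as_single_whole = whole * twelve_seven               # one whole note in a 12:7 tuplet

print(target, as_triplet_group, as_single_whole)     # 7/3 7/3 7/3
```

All three spellings come out to 7/3 of a quarter note, which is the 2.333... quarter length quoted above.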

The other practical change in Tuplets is that there are generally three aspects to a Tuplet (well, four, but we’ll keep it simpler): the number of “actual” notes (3 for a triplet), the number of “normal” notes (2), and the durationNormal (‘eighth’, no dots, for instance).  In theory, once a tuplet was attached to a note, it became immutable (frozen), but because normal note was a DurationUnit, it was possible to create the tuplet and then to change the normal note type, or dots, or whatever.  Now that durationNormal references a DurationTuple, it is immutable; so instead of this:

   t = duration.Tuplet()
   t.durationNormal.type = 'quarter'

do this:

   t = duration.Tuplet(durationNormal='eighth')

or if you must:

   t = duration.Tuplet()
   t.durationNormal = duration.durationTupleFromTypeDots("eighth", 0)

(this may go away, tuplets might become fully immutable in the future)

I hope these changes make using the system faster without much trouble.

Sunday, September 6, 2015

Speed improvements in music21

Music21 continues to get faster and faster.  The average music21 internal operation takes about 1/8th as long on the same computer as it did when the system was first released. And the processing speed for normal operations is about twice as fast as it was back then.  

Huh? Why only half the time?  Well, every time we get a speedup, we spend half of it on making the system more robust.  So for instance, here's how long (in seconds) it took to make 10,000 notes in 2008 and 2013:

2008 Sep  ~1.1
2013 Nov   0.777

Well, that was a pretty good improvement. But there were all sorts of problems with tuplets in music21 (especially from MIDI), where, for instance, five quintuplet 16ths could add up to 0.9999 quarter notes.  So we switched to a Fraction module for safety, and we lost the speedups:

2014 Jul   1.126
2015 Jan1  1.154

That seemed too slow, so in January, we undertook a large number of tweaks, described below, and got it down to:

2015 Jan19 0.516

We're still working, so in 2.0.10 when it is released you'll find that 10,000 notes now takes:

2015 Sep6  0.400

This gives a lot of room to play with to start making the system safer and more secure.

Deepcopy performance still leaves a lot to be desired. This will be the next focus.

This article will get updated as the timing improves (or is sacrificed for security).

>>> from timeit import timeit as t

========== Note
#1 Baseline

>>> t('n=note.Note()', 'from music21 import stream, note; import copy; n=note.Note()', number=10000)
>>> t('copy.deepcopy(n)', 'from music21 import stream, note; import copy; n=note.Note()', number=10000)

#2 Instantiation Tweaks

>>> t('n=note.Note()', 'from music21 import stream, note; import copy; n=note.Note()', number=10000)
>>> t('copy.deepcopy(n)', 'from music21 import stream, note; import copy; n=note.Note()', number=10000)

#3 Deepcopy of Durations

>>> t('n=note.Note()', 'from music21 import stream, note; import copy; n=note.Note()', number=10000)
>>> t('copy.deepcopy(n)', 'from music21 import stream, note; import copy; n=note.Note()', number=10000)

#4 Tweaks to Pitch

>>> t('n=note.Note()', 'from music21 import stream, note; import copy; n=note.Note()', number=10000)
>>> t('copy.deepcopy(n)', 'from music21 import stream, note; import copy; n=note.Note()', number=10000)

#5 Move imports out of frequently called objects

>>> t('n=note.Note()', 'from music21 import stream, note; import copy; n=note.Note()', number=10000)

#6 2.0.10 -- 2015 Sep improvements to setting up durations and sites:

>>> t('n=note.Note()', 'from music21 import stream, note; import copy; n=note.Note()', number=10000)
>>> t('copy.deepcopy(n)', 'from music21 import stream, note; import copy; n=note.Note()', number=10000)

========= GeneralNote

# 1 Baseline

>>> t('n=note.GeneralNote()', 'from music21 import stream, note; import copy; n=note.Note()', number=10000)
>>> t('copy.deepcopy(n)', 'from music21 import stream, note; import copy; n=note.GeneralNote()', number=10000)

# 2 Instantiation Tweaks

>>> t('n=note.GeneralNote()', 'from music21 import stream, note; import copy; n=note.Note()', number=10000)
>>> t('copy.deepcopy(n)', 'from music21 import stream, note; import copy; n=note.GeneralNote()', number=10000)

# 3 Deepcopy of Durations

>>> t('n=note.GeneralNote()', 'from music21 import stream, note; import copy; n=note.Note()', number=10000)
>>> t('copy.deepcopy(n)', 'from music21 import stream, note; import copy; n=note.GeneralNote()', number=10000)

For comparison:

>>> t('n=note.NotRest()', 'from music21 import base, note; import copy;', number=10000)
>>> t('n=note.Rest()', 'from music21 import base, note; import copy;', number=10000)
>>> t('n=note.Unpitched()', 'from music21 import base, note; import copy;', number=10000)
>>> t('copy.deepcopy(n)', 'from music21 import stream, note; import copy; n=note.Unpitched()', number=10000)

Chords are fast...
>>> t('c=chord.Chord()', 'from music21 import chord; import copy;', number=10000)

But each additional note is 0.5s per 10000
>>> t('c=chord.Chord(["C"])', 'from music21 import chord; import copy;', number=10000)
>>> t('c=chord.Chord(["C","E","G"])', 'from music21 import chord; import copy;', number=10000)


>>> t('copy.deepcopy(p)', 'from music21 import pitch; import copy; p=pitch.Pitch("C")', number=10000)
>>> t('p=pitch.Pitch("C")', 'from music21 import pitch; import copy; p=pitch.Pitch("C")', number=10000)

after tweaks:
>>> t('p=pitch.Pitch("C")', 'from music21 import pitch; import copy; p=pitch.Pitch("C")', number=10000)

Accidentals are .08s

>>> t('n=note.Note("C")', 'from music21 import stream, note; import copy; n=note.Note()', number=10000)
>>> t('n=note.Note("C#")', 'from music21 import stream, note; import copy; n=note.Note()', number=10000)

========= Music21Object

# 1 Baseline

>>> t('n=base.Music21Object()', 'from music21 import base, note; import copy; n=base.Music21Object()', number=10000)
>>> t('copy.deepcopy(n)', 'from music21 import base, note; import copy; n=base.Music21Object()', number=10000)

One subclass away (__init__ does nothing but call super(__init__)):

>>> t('n=base.ElementWrapper()', 'from music21 import base, note; import copy;', number=10000)
>>> t('copy.deepcopy(n)', 'from music21 import base, note; import copy; n=base.ElementWrapper()', number=10000)

# 2 Sites and Duration improvements (Sep 2015)

>>> t('n=base.Music21Object()', 'from music21 import base, note; import copy; n=base.Music21Object()', number=10000)

Deepcopy is MUCH slower than just creating a new one...

>>> t('copy.deepcopy(n)', 'from music21 import base, note; import copy; n=base.Music21Object()', number=10000)
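
The same measurement pattern works on any object, so you can try it on your own classes. This sketch uses a hypothetical stand-in class (`Sketch`) instead of a music21 object and deliberately prints whatever times your machine produces instead of quoting numbers.

```python
import copy
from timeit import timeit


class Sketch:
    """Hypothetical stand-in object with a few nested attributes."""

    def __init__(self):
        self.pitch = {"step": "C", "octave": 4}
        self.duration = [1.0]
        self.sites = {}


def benchmark(number=10000):
    """Return (seconds to instantiate, seconds to deepcopy) for `number` runs."""
    instance = Sketch()
    create = timeit(Sketch, number=number)
    clone = timeit(lambda: copy.deepcopy(instance), number=number)
    return create, clone


create_time, copy_time = benchmark(number=1000)
print("create: {:.4f}s  deepcopy: {:.4f}s".format(create_time, copy_time))
```

Passing a callable to timeit avoids the setup-string juggling used in the session above, at the cost of a small function-call overhead per run.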

Tuesday, September 1, 2015

Music21 v.2.0.8 (beta) released

The newest release of the music21 v.2 (beta) developments improves the stability and performance of the system. 
The biggest change in this version is the movement of all experimental modules into a new "alpha" sub-package; in the future, anything that adds something useful to music21 but is not yet well tested or documented will begin in this folder. Modules there may graduate into the main music21 namespace at some later point, remain in "alpha", or be removed. The modules that moved include: webapps (system for running music21 as a WSGI service-oriented architecture), trecento (fourteenth-century musical analysis), theoryAnalysis (common-practice error detection), counterpoint/species (first-species counterpoint generator), medren (miscellaneous pre-1600 applications; this will return soon after refactoring), contour (contour analysis), chant (Gregorian chant generation), and (find scales inside Streams).
The "demos" directory has also been reorganized. The next release will focus on making this directory easier to use.
Bug fixes and improvements:
  • search.lyrics -- modules for searching within lyrics while retaining position information about matches.
  • Interval objects have been improved to have additional properties.
  • .priority changes will automatically re-sort Streams. This change will make .priority more useful.
  • Stream.elementsChanged() is a new method that can trigger a cache clear for Streams.
  • Stream.remove() gets a recurse function.
  • Lilypond color works again (thanks Ringw)
  • Accidental becomes a SlottedObject -- pro: much faster. con: arbitrary attributes cannot be added to Accidentals.
  • MuseScore 2 is now discovered automatically in python3
  • Tremolo support (including in MusicXML)
  • Unlikely bugs in Chord fixed.
  • MIDI support for > 16 channels output.
  • Fixes for PIL/Pillow support of more versions
  • More places taking advantage of exact fractions in music21 offsets and lengths which used to be kludges
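
The trade-off noted for Accidental in the list above (faster, but no arbitrary attributes) comes from Python's `__slots__`. The class below is a hypothetical stand-in to show the behavior; it is not music21's SlottedObject.

```python
class SlottedAccidental:
    """Hypothetical stand-in: only the declared attributes can exist."""

    __slots__ = ("name", "alter")

    def __init__(self, name, alter):
        self.name = name
        self.alter = alter


sharp = SlottedAccidental("sharp", 1.0)

try:
    sharp.color = "red"      # not declared in __slots__
    accepted = True
except AttributeError:
    accepted = False

print(accepted)                      # False
print(hasattr(sharp, "__dict__"))    # False
```

Without a per-instance `__dict__`, each object is smaller and attribute storage is fixed, which is where the speed comes from.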

Tuesday, June 16, 2015

Parallel Computing with music21

First we start the cluster system with ipcluster start, which on this six-core Mac Pro gives me 12 threads. Then I'll start IPython notebook with ipython notebook.
from __future__ import print_function
from IPython import parallel
clients = parallel.Client()
clients.block = True
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11]

Now I'll create a view that can balance the load automatically.
view = clients.load_balanced_view()
Next let me get a list of all the Bach chorales' filenames inside music21:
from music21 import *
chorales = list(corpus.chorales.Iterator(returnType = 'filename'))
['bach/bwv269', 'bach/bwv347', 'bach/bwv153.1', 'bach/bwv86.6', 'bach/bwv267']

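Now, I can use the view.map function to automatically run a function, in this case corpus.parse, on each element of the chorales list.
view.map(corpus.parse, chorales[0:4])
[<music21.stream.Score 4467044944>,
 <music21.stream.Score 4467216976>,
 <music21.stream.Score 4465996368>,
 <music21.stream.Score 4465734224>]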
Note though that the overhead of returning a complete music21 Score from each processor is high enough that we don't get much of a savings, if any, from parsing on each core and returning the Score object:
import time
t = time.time()
x = view.map(corpus.parse, chorales[0:30])
print("Multiprocessed", time.time() - t)
t = time.time()
x = [corpus.parse(y) for y in chorales[0:30]]
print("Single processed", time.time() - t)
Multiprocessed 1.7093911171
Single processed 2.04412794113
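
One way to get a feel for that overhead: a result has to be serialized before it can travel back from a worker to the main process. The sketch below uses the standard-library pickle module on a stand-in nested structure. The name make_fake_score and the dictionary layout are hypothetical, not a music21 Score, and I'm assuming the cluster serializes results in broadly this way.

```python
import pickle


def make_fake_score(n_parts=4, n_notes=200):
    """Build a nested stand-in for a parsed score (hypothetical structure)."""
    return [
        [{"pitch": 60 + (i % 12), "offset": i * 0.5, "duration": 0.5}
         for i in range(n_notes)]
        for _ in range(n_parts)
    ]


fake_score = make_fake_score()
note_count = sum(len(part) for part in fake_score)

full_payload = pickle.dumps(fake_score)    # cost of returning the whole object
small_payload = pickle.dumps(note_count)   # cost of returning just a number

print(len(full_payload), "bytes for the whole structure")
print(len(small_payload), "bytes for the count")
```

A real Score is far richer than this toy list of dictionaries, so the gap between the two payloads should only be larger in practice.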

But let's instead just return the length of each chorale, so we don't need to pass much information back to the main server. First we need to import music21 on each client:
clients[:].execute('from music21 import *')
<AsyncResult: finished>
Now, we'll define a function that parses the chorale and returns how many pitches are in the Chorale:
def parseLength(fn):
    c = corpus.parse(fn)
    return len(c.flat.pitches)
Now we're going to see a big difference:
t = time.time()
x = view.map(parseLength, chorales[0:30])
print("Multiprocessed", time.time() - t)
t = time.time()
x = [parseLength(y) for y in chorales[0:30]]
print("Single processed", time.time() - t)
Multiprocessed 0.59440112114
Single processed 2.97019314766

In fact, we can do the entire chorale dataset in about the same amount of time as it takes to do just the first 30 on a single core:
t = time.time()
x = view.map(parseLength, chorales)
print(len(chorales), 'chorales in', time.time() - t, 'seconds')
347 chorales in 5.31799721718 seconds
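
For a rough sense of scale, the arithmetic on the printed timings can be spelled out as below. The numbers are copied from the output above (0.59 s on the cluster and 2.97 s in the plain list comprehension for 30 chorales, 5.32 s on the cluster for all 347). The single-core estimate for the full set is an extrapolation that assumes the per-chorale cost stays near the 30-chorale average; it is not a measurement.

```python
# Timings in seconds, copied from the printed output above
parallel_30 = 0.59440112114
single_30 = 2.97019314766
parallel_all = 5.31799721718
n_all = 347

speedup_30 = single_30 / parallel_30

per_chorale_single = single_30 / 30
per_chorale_parallel = parallel_all / n_all

# Extrapolated, not measured: assumes the 30-chorale average holds throughout
estimated_single_all = per_chorale_single * n_all
speedup_all = estimated_single_all / parallel_all

print("speedup on 30 chorales: %.1fx" % speedup_30)
print("seconds per chorale, single core: %.3f" % per_chorale_single)
print("seconds per chorale, cluster: %.3f" % per_chorale_parallel)
print("estimated single-core time for all: %.1f s" % estimated_single_all)
print("estimated speedup on all: %.1fx" % speedup_all)
```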

I hope that this example gives some sense of what might be done w/ a cluster situation in music21. If you can't afford your own Mac Pro or you need even more power, it's possible to rent an hour of cluster computing time at Amazon Web Services for just a few bucks.

Music21 v.2.0.5 (beta) released

The newest version of the beta 2.0 track of music21 has been released. A reminder that the 2.0 track involves potentially incompatible changes w/ 1.X so upgrade slowly and carefully if you need existing programs to work. Changes are being made to simplify and speed up usage and make the system more expandable for the future.

Download at or with PyPI.

Major Changes

  • Complete rewrite of TinyNotation. TinyNotation was one of the oldest modules in music21 and it showed: I was still learning Python when I wrote it. It documents a simple way of getting notation into music21 via a lily-like text interface. It was designed to be subclassable to make it work on whatever notation you wanted to use. And technically it was, but subclassing it was so difficult as to be nearly impossible. Now you'll find it much simpler to subclass. Demos of subclassing are included in the code (esp. HarmonyNotation and trecento.notation); a tutorial is to come soon.
  • backwards incompatible changes: (1) you used to be able to specify an initial time signature to TinyNotation as converter.parse("tinynotation: c4 d e f", "4/4"); now you must put the time signature string into the text itself, as converter.parse("tinynotation: 4/4 c4 d e f"). "cut" and "c" time signatures are no longer supported; use 2/2 and 4/4 instead. (2) calling tinyNotation.TinyNotationStream() directly doesn't work any more. Use the converter.parse interface instead, either with the "tinynotation:" header or with format="tinynotation". If you must use the guts, try tinyNotation.Converter("4/4 c4 d e f").parse().stream. (3) TinyNotation used to return its own "TinyNotationStream" class, which was basically incompatible with everything. Now it returns a standard stream.Part(). (4) TinyNotation used to leave notes outside of measures, etc.; you needed to call .makeMeasures() afterwards. If you need the older behavior, use converter.parse('tinynotation: 4/4 c2 d', makeNotation=False).
  • MuseScore works as a PNG/PDF format. First run: us = environment.UserSettings(); us['musescoreDirectPNGPath'] = '/Applications/MuseScore' (or wherever you have it). Then try calling .show('musicxml.png') and watch the image arrive about 100x faster than it would in Lilypond. Thanks MuseScore folks! This is now the default format for .show() in iPython notebook. Examples using lily.png and lily.pdf will migrate to this format, so that Lilypond can be moved to deprecated-but-not-to-be-removed status. (I just don't have time to keep up.)
  • demos/gatherAccidentals : a good first test programming assignment for students. I use it a lot in teaching.
  • musicxml parses clefs mid-measure (thanks fzalkow)
  • installer.command updated for OS X (thanks Andrew Hankinson); let me know if this causes a problem.
  • postTonalTools demo in usersGuide.
  • DataSet feature extractor gets a .failFast = False option for debugging.
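
To make the TinyNotation changes above concrete (the time signature now lives in the text itself, and notes arrive grouped into measures), here is a toy reader written with the standard library only. It is emphatically not music21's parser: parse_toy_tiny is a hypothetical name, it handles only a leading time signature plus note letters with an optional duration number, and it ignores octaves, accidentals, rests, and everything else.

```python
import re
from fractions import Fraction


def parse_toy_tiny(text):
    """Toy reader for strings like '4/4 c4 d e f' (NOT music21's parser)."""
    tokens = text.split()
    match = re.fullmatch(r"(\d+)/(\d+)", tokens[0]) if tokens else None
    if match is None:
        raise ValueError("expected a leading time signature such as '4/4'")
    numerator, denominator = int(match.group(1)), int(match.group(2))
    bar_length = Fraction(4 * numerator, denominator)  # in quarter lengths

    measures, current, filled = [], [], Fraction(0)
    length = Fraction(1)  # default to a quarter note until a number appears
    for token in tokens[1:]:
        note = re.fullmatch(r"([a-gA-G])(\d+)?", token)
        if note is None:
            raise ValueError("unsupported token: %r" % token)
        if note.group(2):
            # 4 = quarter, 2 = half, 8 = eighth; the value carries forward
            length = Fraction(4, int(note.group(2)))
        current.append((note.group(1), length))
        filled += length
        if filled >= bar_length:  # bar is full: start a new measure
            measures.append(current)
            current, filled = [], Fraction(0)
    if current:
        measures.append(current)
    return (numerator, denominator), measures


time_signature, measures = parse_toy_tiny("4/4 c4 d e f g2 a")
print(time_signature)
for number, measure in enumerate(measures, start=1):
    print(number, [(name, str(ql)) for name, ql in measure])
```

Leaving the time signature out of the string raises a ValueError in this toy; that is a design choice of the sketch, not a claim about how music21 itself reacts.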

Under the hood / contributors

  • music21 now uses coverage checking via We are at 91.5% code coverage; meaning when the test suite is run, 91% of all the lines of code are tested. Aiming for 95% (100% is impossible). Adding coverage checking let me find a lot of places that weren’t being tested that, lo and behold!, had bugs. What it means for contributors: any commit that is longer than 20 lines of code needs to improve the coverage percentage and help us get to 95%. So make sure that at least 92% (better 99%) of your code is covered by tests.
  • the romanText.objects module has been renamed romanText.rtObjects to not conflict with external libraries. It’s an implementation detail.
  • added demo of how to subclass SubConverter.

Minor Changes

  • measure number suffixes in musicxml output, not just input.
  • language detector can detect Latin and Dutch language texts now.
  • fix pitch class errors in microtones.
  • midi files with negative durations no longer crash the system.
  • bugs fixed in tonalCertainty. You can be more certain that it works.
  • cPickle is used in Python3 now. Faster.
  • midi parsing can specify quantization levels.
  • music21.__version__ gives the version (maxalbert did a lot this commit; forgot to shout out before!)
  • better detection of lilypond binaries.
  • certain Sibelius MusicXML files with UTF-16 BOMs can now be read.
  • rests imported from MusicXML used to lose the expressions attached to them (fermatas, etc.); fixed.
  • serial.ToneRow() now has the notes each as quarter notes rather than as zero-length notes; it makes .show() possible; backwards incompatible for the small number of people using it.
  • colored notation now works better and in more places.
  • better docs.
  • about a trillion tiny bugs and untested pieces of code identified and fixed by glasperfan (Hugh Z.)


Looking forward to the 2.1 release!

Saturday, April 11, 2015

Crunching ballads (Bibliolore)

The RILM blog, Bibliolore, published a recent post about the early history of computational musicology:

In the 1940s Bertrand Harris Bronson became one of the first scholars to use computers for musicological work.
For one of his projects he encoded melodic characteristics of hundreds of tunes collected for the traditional ballad Barbara Allen on punch cards, so a computer could ferret out similarities. His project resulted in four groups of tunes, members of which came from both sides of the Atlantic with varying frequency.
Read more at their site.  The RILM blog is amazing in any case.

Saturday, March 14, 2015

OSU Empirical Music Research -- Learn from the Master...

The master of empirical music studies and the inventor of Humdrum, David Huron, announced a workshop to be held in late May in Columbus, Ohio -- probably of great interest to anyone working with music21 who wants to put their work into a broader context:

Methods in Empirical Music Research: A Workshop for Music Scholars.
Monday May 18 to Friday May 22, 2015.
School of Music, Ohio State University

We are pleased to announce a workshop on empirical methods in music research. This is an intensive five-day workshop taught by Prof. David Huron.

The workshop will be of interest to anyone wishing to expand or enhance their research skills in music. Participants will learn how to design and carry out music experiments, and how to apply empirical, systematic and statistical techniques to problems in music history, analysis, performance, education, culture, policy, and other areas. The workshop is designed specifically to develop practical research skills for musicians and music scholars with little or no previous background in empirical methods.

The workshop introduces participants to a number of methods, including descriptive, exploratory and questionnaire methods, field research, interview techniques, correlational and experimental methods, hypothesis testing, theory formation, and other useful research tools and concepts. Participants will also learn how to read and critique published empirical research related to music - identifying strengths and weaknesses in individual music-related studies.

The workshop objectives will be pursued through a series of day-long activities, including lectures and demonstrations, interspersed with twenty hands-on and group activities.

Three different forms of registration are available for workshop participants. The fee for non-credit participation is $450. A fee schedule for continuing education, and graduate course credit (2 credit hours) is pending. Participants are responsible for their own transportation, food and accommodation.

For further details, see

Thursday, January 15, 2015

Ten(+) Years of music21 (pt. 1)

An important milestone for music21 passed unnoticed by me a few weeks ago: ten years since the first time that code (well mostly documentation and tests) that still remains somewhere in the music21 codebase was run and did something useful (a scrambled loop for a composition project I was working on called Fine dead sound, after Faulkner).  It seemed worth saying a bit about the history of the project and things learned along the way.

Music21 started as a Perl project -- the oldest code I can find that even vaguely resembles the current music21 comes from November 2003, a 30k Perl script to label a Bach chorale with Roman numerals and correct keys (I still need to publish the algorithm somewhere since I haven't seen a better one). It had no graphical output except using David Rakowski's Lassus font. I was able to get it back up and running and through the modern miracle of web fonts, everyone (except IE users) can play with the oldest music21 system at: If I remember correctly, chords in red had only a passing functionality, the small letters are the roots of chords and the second line of text is the locally active key. Pivot chords are labeled in brackets. This was done for an assignment in Avi Pfeffer's AI and Music class; others in my group had already produced the code (in C I believe) for removing passing tones from the score.

I had begun that project using Humdrum, a system I knew well and still have great admiration for, but I began to need to encapsulate more and more items into individual classes.  I googled around, because I figured someone would have already gotten far in an object-oriented replacement for Humdrum. But I found nothing, so when I got back to the project about a year later I started creating generalized objects which I incorporated into a package called "PMusic" for "Perl Music." (I think I first called it just "Music" but realized that this was going to be a problem.)  Here is part of the PMusic code for Chords, written on 12/31/2004.  Good documentation was already a music21 principle, but names of notes were still based on Kern/Humdrum:

=head1 PMusic::Chord (class)

class for dealing with chords

  $ch1 = PMusic::Chord->new(['CC', 'E', 'g', 'b-']);

Note double set of brackets, indicating a perl array reference.
Doesnt deal with rests yet.  Chords of more or less than four notes might
be odd...

Can also give an arrayRef of Note objects if you're that perverse...


package PMusic::Chord;
use strict;
use Carp;

sub new {
  my $proto = shift;
  my $class = ref($proto) || $proto;
  my %args;

  ## if only one arg, assume it's an arrayref of Note names
  if (scalar @_ == 1) { $args{Notes} = $_[0] }
  else { %args = @_ };
  my $self  = \%args;
  bless $self, $class;

### for now, first note is always bass of chord (does tenor ever go below bass? sometimes!)

=item NoteObjs()

Returns an array ref of PMusic::Note objects


sub NoteObjs {
  my $self = shift;
  return $self->{NoteObjs} if $self->{NoteObjs};
  my @objs = ();
  foreach (@{$self->{Notes}}) {
    if (ref($_)) { push @objs, $_; }
    else         { push @objs, Note->new($_); }
  $self->{NoteObjs} = \@objs;

=item Bass([note PMusic::Note])

returns the bass note or sets it to note.  Usually defined as the first noteObj
but we want to be able to override this.  You might want an implied
bass for instance...  v o9.


sub Bass {
  my $self = shift;
  $self->{Bass} = $_[0] if defined $_[0];
  $self->{Bass} ||= $self->NoteObjs->[0];
  return $self->{Bass};


At this point I was living in Rome at the American Academy as a Rome Prize fellow in Medieval Studies, and halfway through my fellowship I realized that none of this code was going to write a six-hundred-page dissertation on fourteenth-century music fragments, and besides, I had job interviews to prepare for, and later, students to teach at Smith College, Mount Holyoke College, and then MIT, none of which (yes, including MIT) cared that I was working on generalized software for musical analysis as a side project. I also saw how long this project was going to take.  And besides, I was still convinced that someone had already written a generalized OO toolkit for musical analysis and that I just wasn't searching well enough.  So the project was put aside for almost two years before it reemerged in Python, as the (to be written) Part 2 explains.