Blog Closed

This blog has moved to Github. This page will not be updated and is not open for comments. Please go to the new site for updated content.

Sunday, November 22, 2009

Running out of Free Time

I'm slowly running out of free time. Ongoing computer problems have diverted an unfortunate amount of my effort recently, and I haven't been able to develop as much as I would like. It's very hard to get into the swing of a hardcore coding session when your computer freezes or spontaneously reboots every 30 minutes on average. I say "on average" because sometimes it's much more frequent than that. I had to reboot my computer twice while writing this first paragraph. Thank goodness for Blogger autosaving my drafts.

On top of that, we're having a kid. An entire new kid. The due date was on Friday, so now we're officially in over-time and very much looking forward to having an "outside" baby to play with. We both fully expect to be spending some serious time in the hospital this weekend or next week, with sooner being better than later. I've never had any of my own children before, so I'm not entirely certain how it's going to change things. But I am certain that they will change, and probably not in a way that gives me more time to hack on Parrot.

Despite my lack of time, this morning I started work on a brand new secret project. I'm not going to give any details about it quite yet (I want to get some more aspects of the design worked out before I go showing off my work), but suffice it to say that if things work out the way I hope they will this project could provide a significant benefit to Parrot and its ecosystem. I'm intending this project to be a Christmas present to the Parrot project, so I hope I can get it done by then. More details to come.

I'm also trying to do some work to get :call_sig working properly. Matrixy is definitely going to need improvements there, and I'm sure other projects will want it too. If I can keep the changes small I will just commit it directly. If it starts getting too large, I'll make a branch instead.
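For the curious, here is a minimal PIR sketch of my understanding of what :call_sig buys you (treat this as an illustration rather than gospel; the exact interface of the call signature PMC may differ slightly):

.sub 'count_args'
    .param pmc sig :call_sig       # the raw call signature object for this call
    .local int argc
    argc = elements sig            # how many positional arguments actually arrived
    print "got "
    print argc
    say " arguments"
.end

Being able to poke at the signature like this is exactly the kind of thing variadic builtins in Matrixy need, since M functions routinely change behavior based on how many arguments the caller passed.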

Matrixy and Parrot-Linear-Algebra are going to take a back seat for now while I focus on other issues and my new secret project. I've got a few cleanups I want to make and some tests to add of course, but no huge development for a little while.

Thursday, November 19, 2009

Parrot Project Proliferation Part 3

This is part 3 of my series on the cool new Parrot projects that are popping up around the interwebs. Today I'm going to introduce Markdown, NQP-RX, and nqpTAP.

Markdown

Markdown is a text markup syntax that's designed to be easy to read and edit. In some ways, it's like wikitext, except Markdown has been driven by a consistent design philosophy while wikitext has grown in a platform-dependent ad hoc way.

Parrot has its own markdown engine now, courtesy of fperrad. It converts markdown input into properly-formatted, valid HTML output. And the best part is that it runs on pure Parrot. So now, on all those cool websites you're making with mod_parrot, you can use this markdown engine to format text.

NQP-RX


It's not exactly a small project, but NQP-RX really deserves some attention. It's a rewrite of NQP and PGE from the Parrot repo that properly integrates the grammars into the NQP language and enables a lot of cool new features that "classic" NQP doesn't have. On top of that, NQP-RX properly bootstraps: it knows enough syntax to parse itself (after it's already been built from PIR source, of course). That's no small feat for a program written in the Parrot equivalent of assembly language.

The old NQP is still hanging around in the Parrot repo like it always has, and projects that were relying on NQP will still be able to work with it. However, the new NQP-RX is developed on github and snapshots of it are kept in the extensions directory in the Parrot repo too.

nqpTAP


Every project in the Parrot ecosystem that I have seen makes extensive use of unit testing. Some projects are test-driven, though the majority seem to use post-facto tests for verification and to prevent regressions. Whatever the purpose, tests are everywhere and the TAP standard is used by almost all of them.

The nqpTAP project, started by dukeleto and based on his work in Plumage, is a pure-Parrot TAP harness that executes tests and summarizes results without depending on anything besides Parrot itself. Keeping dependencies low is always a good thing, and nqpTAP helps to reduce the barrier for new projects looking to create a proper test suite. Even better, nqpTAP targets the new NQP-RX, so it will be stable and working long into the future.

These three projects are very interesting, and I think it's worthwhile to give them at least a first look.

Wednesday, November 18, 2009

Trac Ticket Hunt

Yesterday, after a herculean effort, the Parrot devs closed out the remaining old tickets from the RT system. Many of the tickets were vague, uncloseable, or unreproducible. Many of them were translated into Trac tickets for further monitoring.

Of course, we still have a lot of open tickets in Trac. Over 600 of them, as Coke pointed out in an email this morning. That's quite a huge number of open issues, and really too many for our current development team to deal with in a reasonable amount of time. We need help dealing with this huge backlog, and this is a great opportunity for new interested people to get involved in Parrot.

Preparing for 2.0

The 1.8 release went out the door on Tuesday, and I think it went much more smoothly than 1.7 did last month. Not completely without hiccups, but better. Now we're in the home stretch for the big 2.0 release in January where the mantra is "production ready".

What does it mean to be production ready? First and foremost I think of stability and reliability. Nobody is going to invest time and effort in software that isn't stable. Next, I think about performance. Computer hardware isn't cheap, and we can't be shipping a piece of software that hogs processor cycles and costs companies more money to support.

With these goals in mind (and I would love to hear what other people think "Production Ready" means), I think there are two big things we need to focus on: Testing and Profiling.

Testing

Test reports are good, and we're starting to get a very large volume of test reports flowing in, including reports from new or exotic systems. Bravo to anybody who has set up an automatic build bot in the past few months. It is sincerely appreciated.

Test reports are a good and necessary first step, but are by no means the end. Tests are good when they all pass, but that's boring (and unfortunately, it's not usually the case). What's really interesting and important is finding the failing tests and writing up Trac tickets for them so they can get fixed. So here are some things you can do to help:
  1. Monitor the stream of incoming Smolder reports and look for failures
  2. If the failure is happening on a platform that you can test on, try to verify it
  3. See if you can isolate the code from the test that is failing. Bonus points: See if you can write your own small test that demonstrates the bug. The smaller the better.
  4. Open a Trac ticket including information about your platform, the revision where the failures first start appearing (as close as you can tell), and any test cases you've produced that exercise it.
  5. If you're able, submit patches to fix the issue, patches to improve testing of the issue, or patches for documentation to explain what the desired behavior is
All of these things would be very awesome, and are all great ways to get involved in Parrot without having to dive into the source code head first.

Profiling and Benchmarking

Let's not fool ourselves: Parrot is not speedy fast right now as far as VMs are concerned. We don't have JIT, we don't have a good GC, we don't even have PIC. We don't do enough caching, our startup time is still terrible, etc. There are lots of big optimizing features that we need to implement in the coming months and years if we truly want to be a viable and even formidable alternative to other VMs.

However, this all doesn't mean that the only performance benefits that we need come from these huge projects with weird acronyms. There are plenty of small implementation and algorithmic improvements that we can make throughout to start slimming this bird down, some of which will have serious effects on our performance. Parrot is so large though that we can't necessarily find all these little optimization opportunities easily. We need to narrow the search. This is where profiling and benchmarking come in. This is where we need you.

Parrot has a fancy new profiler tool that can be used to profile programs written in PIR or any of the HLLs that run on top of Parrot. It still needs lots of documentation, but it should be mostly easy to use for people willing to poke around in it. If you can find a good example program that demonstrates some real-world usage patterns, we would love for you to profile it and send us the reports. Knowing where the bottlenecks and slowdowns are will help us to target, refactor, and improve them, and that's a big help.

To start up the profiler, run Parrot with this incantation:

> parrot -Rprofiling

For more information about what to do with it, hop onto the IRC channel and ask around. I haven't used this much myself, but it would be cool to get started.
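If you want something quick to try, here's the sort of toy program you could start with. The file name is made up and the example is deliberately dumb; the point is just to have something repetitive for the profiler to chew on:

.sub 'main' :main
    .local pmc arr
    .local int i
    arr = new 'ResizableIntegerArray'
    i = 0
  loop:
    if i >= 100000 goto done
    push arr, i
    inc i
    goto loop
  done:
    $I0 = elements arr
    say $I0
.end

> parrot -Rprofiling pushloop.pir

I believe the profiling runcore dumps its raw data to a file in the current directory, which can then be post-processed into something readable, but again, ask on IRC for the current state of the tooling.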

To prove that we are indeed making things faster, we need benchmarks. Good benchmarks are programs that perform lots of very repetitive work and target a particular Parrot subsystem. We want programs that really exercise Parrot, and can do it in a consistent way. Then, we can use timings on these benchmarks to show whether Parrot's performance is improving or getting worse over time. This is very important.
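To make that concrete, here's the shape of a trivial micro-benchmark in PIR: do one narrow thing a huge number of times and time it with the time opcode. This one only exercises integer ops, so a real suite would want many more like it, each targeting a different subsystem:

.sub 'main' :main
    .local num start, stop, elapsed
    .local int i, total
    start = time
    i = 0
    total = 0
  loop:
    if i >= 1000000 goto done
    total = total + i
    inc i
    goto loop
  done:
    stop = time
    elapsed = stop - start
    print "elapsed seconds: "
    say elapsed
.end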

Ticket Triage

As Coke mentioned in his email, we can't sit back and congratulate ourselves now that RT is empty. We need to focus our attention now on the growing backlog of tickets in Trac. Some of the issues documented there are very serious and will definitely prevent Parrot from being stable and "production ready" by 2.0.

As Coke outlined, we need people to go through old tickets and answer a few questions:
  1. Can we reproduce this issue now with Parrot 1.8.0? Many tickets were filed weeks or even months ago, and may have disappeared in the course of normal development
  2. Look at RFC tickets (requests for comments) and weigh in. Do the changes described make sense? Would they be beneficial? Many of these tickets are simply waiting for some kind of discussion before they get closed.
  3. If the ticket involves an HLL, see if you can reproduce the issue using pure-PIR code instead of high-level code. Parrot operates on PIR natively, so Parrot developers are most easily going to be able to fix problems that can be demonstrated in PIR
  4. If you see a ticket with a segfault, SIGBUS, SIGFPE, or other system error condition, see if you can provide a backtrace.
  5. If a ticket contains a patch, see if the patch still applies cleanly to trunk. If so, see if the patch fixes the problem.
  6. Add comments or set other information to make sure the ticket stays up-to-date and informative. Even if the information you add is small ("Still a problem!" or "fixed!"), that's still something. If nothing else, make sure the ticket is properly categorized by component, milestone, etc. You'll probably need to create a Trac account (free and easy!) in order to make modifications
  7. Look for duplicates. If two tickets describe the same problem, one of them can go.
  8. If the ticket can be legitimately closed (fixed, no longer a problem, a duplicate, etc.), make sure that happens. Hop on IRC or the mailing list and harass people until it gets closed. It may be a little bit annoying, but it will get results.

Conclusion

I haven't done a Parrot4Newbies post in a while, and I know some people have been looking for ways to get involved. With 2.0 on the horizon, testing, profiling, and ticket triaging are all great and incredibly pertinent ways to get involved. And more importantly than just being involved, these are all great ways to help Parrot grow and get ready for the big milestone. So if you are interested in Parrot and have a few spare moments, take a look at some tickets and see what you can accomplish. I can guarantee that anything you get done will be much appreciated.

Tuesday, November 17, 2009

Parrot Users Mailing List

Received a message from Coke this morning: We have a new mailing list set up specifically for users of Parrot and applications running on top of Parrot. If you would like a place to chat about Parrot without getting sucked into the minutiae of the developers mailing list, the parrot-users list might be the thing to look at. Subscription to the list is free and easy.

It has also been suggested that all the Parrot developers join that list to help answer questions. Hopefully it will be a great place for new Parrot users to go, get help, meet other developers, and get started using our software!

Saturday, November 14, 2009

Matrixy Passing (Almost) All Tests

I went on a bit of a marathon today, and I'm pleased to announce that the pla_integration branch of Matrixy is passing almost all tests. All the tests that it is failing are relying on Complex number handling, which Parrot-Linear-Algebra doesn't support yet.

This is a pretty big vindication of the approach I have been taking with Parrot-Linear-Algebra: despite the project being relatively young, the various matrix types it provides are turning out to be very stable and robust. The NumMatrix2D type is the most used and the most feature-complete right now, but relative newcomer CharMatrix2D is proving to be very powerful and useful as well.

PMCMatrix2D is mostly unused for now. However, I do intend to use that in the near future to implement Cell Arrays in Matrixy. This is a feature that we had no good way of implementing in the old Matrixy, but in the new system it looks like a very natural extension of the other things I've been doing.

With Cell Arrays, it should be possible, if not easy, to properly implement variadic input and output function arguments. This is a huge issue that's preventing Matrixy's library of builtin functions from being accurate to the behavior provided by Octave or MATLAB. It's also preventing us from writing more functions directly in M, instead of in PIR. Even if we have to descend into inline PIR for some things, it would be great if we could write more code in M.

Starting tomorrow I'm going to try and add some preliminary Complex number support to Parrot-Linear-Algebra, and try to get the remaining Matrixy tests passing with it. With that we'll be back to where we were before the projects were split, and new development can begin in earnest.

Friday, November 13, 2009

The Path to Matrixy

I've spent a little bit of time in the last week working on the pla_integration branch for Matrixy. The goal of that branch is to update Matrixy to use Parrot-Linear-Algebra as an external dependency, and to use its matrix types natively instead of the ad hoc home-brewed aggregate types I had been using previously. In short, I'm doing things the way they should have been done in the first place.

As of Sunday evening, the branch could build, run, and execute the test suite. There are several tests that are failing still (many of which are to be expected), but a good number that are successfully passing too, which is good.

String handling is one area where I was expecting some serious regressions, and I was not surprised in that regard. M uses very idiosyncratic string handling, as I've mentioned before, and all tests that rely on complicated string behavior are failing miserably. In response to this, last night I created a new matrix type in the Parrot-Linear-Algebra project: a 2D character matrix type that will be used to implement Matrixy's string handling. My task now is to improve this new matrix type (including adding a test suite for it) and integrate it into Matrixy. I started some of that last night but haven't pushed any of my work to the public repo yet.

Another thing I added last night was the "matrix" role. Any of the matrix types in Parrot-Linear-Algebra will now respond positively to a "does 'matrix'" query. An object that fulfills the matrix role will have the following properties:
  1. Contains a number of elements arranged in a rectangular NxM grid
  2. Elements can be accessed linearly using a single integer index (behavior varies by type, but it is always possible)
  3. Elements can be accessed by X,Y 2-ary coordinates to retrieve a value
  4. Matrices can grow automatically to accommodate newly inserted values
  5. Matrices should stringify in such a way that each row of the matrix is on its own line (rows separated by newlines). The formatting of each individual row depends on the type of the matrix.
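To make that a bit more concrete, here is a rough PIR sketch of what this interface looks like from the caller's side. The loadlib line is from memory and the group name may well be wrong, so treat this as illustrative only:

.sub 'main' :main
    .local pmc lib, m
    lib = loadlib 'linalg_group'   # PLA dynpmc group; the name here is a guess
    m = new 'NumMatrix2D'
    m[0;0] = 1.0                   # 2-ary keyed access
    m[2;3] = 3.5                   # the matrix grows automatically to fit
    $I0 = does m, 'matrix'         # responds positively to the "matrix" role
    $N0 = m[1]                     # linear indexing with a single integer
    say m                          # stringifies with one row per line
.end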
With a nice standard interface like this, these matrix types should be consistently usable from HLLs, and I know Matrixy is going to be making aggressive use of these features to implement even its most basic behaviors. I may try to add new requirements to this list; specifically, there are some methods that I would like every matrix type to have (is_scalar(), is_empty(), etc.), and maybe a few other behaviors that should become standard across all our types. I'm starting to think that a templating system will become a necessity to prevent us from needing to rewrite similar algorithms for what could become dozens of matrix and vector types. The improved grammar support in NQP-RX may be the catalyst that makes these changes possible. It's another task for another day.

Speaking of tasks, on the near-term TODO list I plan to add specialized vector types to Parrot-Linear-Algebra, add tests for the new matrix and vector types, beef up the CharMatrix2D type to handle the string operations that Matrixy needs, and continue fixing the pla_integration branch for Matrixy to pass more of its tests. On top of all that, there's a testing hackathon this weekend that I want to participate in, some work for Wittie that I need to finish, and possibly having a baby. The next few days could turn out to be very busy for me!

Thursday, November 12, 2009

String Handling in M

I've been doing a lot of work recently on the pla_integration branch for Matrixy. The goals of this branch (which is likely to just become the new master) are to integrate Matrixy with the Parrot-Linear-Algebra project, and to use its new matrix types instead of home-brewing our own.

I'm already seeing some good results: I've got lots of important tests passing and performance seems to be nice (though startup time and overall PCT processing time are worse; that's another issue for another day). However, the one area where I am still having a lot of problems is in fixing the handling of strings.

Strings in M are very idiosyncratic. This is especially true when we start mixing strings with matrices. One good thing I am finding is that the various idiosyncrasies and even--dare I say--inconsistencies help give a certain insight into the way Octave does its parsing. We should be able to take those insights and try to produce a sane and compliant parser for Matrixy. The best way to proceed is through a series of examples. As a reminder about M, lines not terminated by a semicolon print their results to the screen, and the % sign is the comment delimiter. I'll show the output of each line in comments below it:

x = ["ok";"w00t"]
% ok
% w00t

Here is a very simple case. We have a matrix with two rows, and each row contains a string. When printed, each row of the matrix appears on its own line. One thing to notice and remember for later is that the strings on these two lines are not the same length.

x = ["ok"; 65, 66]
% ok
% AB

M has a lot of close ties to Fortran, which is what the original version of MATLAB was developed in. In later times I believe it was ported to C and then to Java, taking on characteristics of each implementation language along the way. In any case, there are some very obvious influences from Fortran and C on the M language. One such influence is that strings are simply arrays of characters. When we print out a matrix that has some kind of internal "I am a matrix of strings" flag set, integer values are converted to their ASCII equivalents and treated as characters.

x = ["ok"; 65, 66, 67]
% "Error: number of columns must match (3 != 2)"

Here is a slightly strange result. In the first example I showed that we can have a matrix where the string literals on each row are different lengths. In the second example I showed that we can treat integers as characters in forming string literals inside a matrix. However, here we see a surprising result: if we try to mix a row of all integers with a row that is a string, they must have the same length.

x = ["A", 66; "C", 67]
% AB
% CD

Mixing integers and strings on a single row works.

x = ["A", 66; "C", 68, 69]
% "Error: number of columns must match (3 != 2)"

...but the rows must be the same lengths when we build them like this.

x = ["ABC"; "D"; "E", 70]
% "Error: number of columns must match (1 != 3)

And we can see from this error message that suddenly line 2 ("D") throws an error because it's not the same length as line 1 ("ABC"), even though this would have worked if we hadn't included line 3 ("E", 70). As a more complicated example, and to clarify how strings of uneven lengths are stored, see this example:

x = ["ABCDE"; "F"];
x(2, 5) = "G";
x
% ABCDE
% F   G

So we can see from here that strings aren't inserted into matrices with arbitrary lengths, they are padded out to be the same length with spaces. Finally:

foo = "Awesome";
x = [foo; 65]
% "Error: number of columns must mach (1 != 7)

So we can see that the checks for these matrix sizings are happening at runtime, not at parse time. (This small example could be explained away by aggressive constant propagation in the optimizer, but I will assure the reader that this holds true in "real" cases as well).

We can divine a few parser rules from all this:
  1. If we have strings in the matrix, we flag the matrix as being a stringified one and print it as a string. This means converting any integers in the matrix to characters in the row-string.
  2. If we have integer or number literals in the matrix, even if they can be converted to ASCII characters, the rows of the matrix must have the same lengths.
  3. Judging from the third-to-last example, they appear to do these length checks and string checks on the matrix after parsing is completed (otherwise, why would it have errored on lines 1 and 2 not being equal length when it didn't see the integer until line 3?).
  4. These checks happen at runtime.
What I think we need to do for parsing these literals is the following:
  1. We parse each row separately, and pass them to a row object constructor at runtime.
  2. If the row contains any strings, set the "has strings" flag. If the row contains any numbers, set the "has numbers" flag.
  3. We pass all row objects to a matrix constructor
  4. If all rows are strings, and no rows have numbers, pad the strings with spaces and insert them into a character matrix (like a normal matrix, but defaults to printing like an array of ASCII characters). Done.
  5. Check row lengths. If all are not the same at this point, throw an error. Done.
  6. If any rows contain strings, create a new character matrix object and populate it with all data, no padding. Done.
  7. If rows only contain numbers, create a normal NumMatrix2D PMC, populate it and return that. Done.
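In PIR-ish pseudocode, the runtime constructor I have in mind would branch roughly like the sketch below. The row objects, their has_strings/has_numbers accessors, and the !make_char_matrix/!make_num_matrix/!check_equal_row_lengths helpers are all hypothetical stand-ins; this is just the decision logic from the list above written out:

.sub '!build_matrix_literal'
    .param pmc rows                    # array of hypothetical row objects
    .local pmc row
    .local int i, n, any_str, any_num
    n = elements rows
    any_str = 0
    any_num = 0
    i = 0
  scan:
    if i >= n goto scanned
    row = rows[i]
    $I0 = row.'has_strings'()          # hypothetical accessor
    unless $I0 goto check_num
    any_str = 1
  check_num:
    $I0 = row.'has_numbers'()          # hypothetical accessor
    unless $I0 goto next_row
    any_num = 1
  next_row:
    inc i
    goto scan
  scanned:
    unless any_str goto numbers_only
    if any_num goto mixed
    # all strings, no bare numbers: pad rows with spaces, build a char matrix
    .tailcall '!make_char_matrix'(rows, 1)
  mixed:
    # strings mixed with numbers: row lengths must already match
    '!check_equal_row_lengths'(rows)   # hypothetical; throws on mismatch
    .tailcall '!make_char_matrix'(rows, 0)
  numbers_only:
    '!check_equal_row_lengths'(rows)
    .tailcall '!make_num_matrix'(rows) # builds a plain NumMatrix2D
.end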
As an aside, there's another example that is worth showing:

x = ["ABC"; 68.1, 69.2, 70.3]
% "ABC"
% "DEF"
x(2, 2)
% ans = E

We see here that floating-point numbers are converted to ASCII integers when inserted into the character matrix, and that rounding sticks: you can't get the original non-integral value back after the conversion. So all my ideas above with the character matrix type should work with this.

So that's what I think we're going to have to do if we want to faithfully reproduce the behavior of Octave. This system will make matrix literals in the code a bit of a performance drain, but that's what we're going to have to live with for now.

Wednesday, November 11, 2009

November Testing Hackathon

A quick announcement before I forget about it:

This weekend, November 14th and 15th, there will be a testing hackathon for Parrot. We want to focus our efforts on improving tests, especially opcode-related tests in t/op/*. Some tasks that we will try to work on are:
  1. Converting tests from Perl to PIR
  2. Improving coverage of tests for ops
  3. Getting lots of platform test reports in anticipation of the 1.8.0 release.
I would love to see lots of people on IRC this weekend to help with the festivities.

Parrot's libJIT framebuilder

I offered a bit of a challenge to my readers a while back, and I mentioned that I had received one (very good) submission from plobsing. It used libJIT to implement a frame builder for Parrot, to replace the old home-brew one that we had been using.

I also hear tell that he's planning a framebuilder using libffi too, although I'll have to talk more about that in a later post.

There were some bumps in the road, however. Regular readers will recognize that between the time he sent me the code and now, the big PCC refactor landed, which changed the landscape for anything involving function calls quite substantially. But plobsing persevered, and with some configuration help from our own kid51, we ended up with a very nice working branch running the new framebuilder.

Assuming things test well, I would like to push to get the libjit_framebuilder branch merged into trunk soon. We need testing so here is something you, the reader, can do:

Test the Branch

Check out the libjit_framebuilder branch and give it a whirl. You want to do this before you have libJIT installed on your system, to get a baseline. Parrot should work normally without libJIT installed.
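For reference, building and testing is just the normal Parrot dance from the branch checkout (nothing branch-specific as far as I know):

perl Configure.pl
make
make test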

Get libJIT

Getting libJIT is probably the hardest part of the whole process. There don't seem to be any Debian packages for downloading, and there definitely don't seem to be any Windows binary installers floating around. Google returns few results when I search for "libjit", mostly Wikipedia and plobsing's own work (not usually a good sign!).

You can download the source to the mainline project HERE, and can apparently download source from a fork of it HERE as well. If you have SVN, you can get the source of the fork from its googlecode project. I'm not really able to find a repo link for the project mainline. Maybe somebody else can help point me in the right direction for that.

From the source you can type this to get it to build on Linux:

./configure
make
sudo make install

On my machine, I also had to run "sudo ldconfig" to update the dynamic linker's cache of shared libraries. If you're on Linux and the build doesn't pick up libJIT, try that.

I have no idea how to get this working on Windows. You may be SOL.

Test the Branch Again


Now that you have libJIT, reconfigure, rebuild, and retest Parrot. It should detect libJIT, use it for the framebuilder, and (I hope) you should see some kind of performance improvement. At the very least, it shouldn't be noticeably slower.

Send Reports

If things work on your platform, let us know. If things don't work, we definitely need to know that as well. If you have any information about how to get or build libJIT on various systems, we would love to start compiling some documentation about that too. More information is always better, and if Parrot is going to use libJIT going forward (even as an optional component for some performance improvements) we should all be aware of how to get and use it.

Tuesday, November 10, 2009

Planning for Lorito

I had an interesting conversation with plobsing this morning about Lorito. He has been wrapping up his work on the libjit-based framebuilder, and is looking to break ground on another project. Specifically, he wanted to start putting together a prototype for Lorito.

I had, along with chromatic and some other people, been thinking about Lorito as a very very low-level language, similar to an assembly language. The idea being that each Lorito op would share a close correspondence with an operation in the JIT engine, and would therefore be trivially easy to compile at runtime.

Plobsing was taking a different approach: He's thinking about a higher-level structured language that we would parse down into native JIT format at build time. There are some merits to this idea and since he mentioned it to me I've been thinking about it more and more. Let's take a look at things in more detail, focusing on LLVM specifically.

First, LLVM natively uses its own platform-neutral LLVM bytecode format. High-level code is parsed and compiled down to LLVM bytecode, which can then be optionally optimized before conversion to machine code. That first part ("parsed and compiled") is expensive: we don't want to be parsing and compiling to LLVM bytecode at runtime, because that would eat up any potential gains we could get from JIT in the first place. What we want is for the Lorito code to be compiled only once: during the Parrot build. From there we will have LLVM .bc files which contain the raw bytecode definitions for each op, which can be combined together into large methods or traces and compiled to machine code in one go.

Ops are defined in .ops files, which are currently written in a preprocessed C-like language but which will eventually be rewritten in Lorito. During the build, we want two things to happen: First, we want to compile the Lorito down into machine code, as we currently do with our ops, for the interpreted cores (fast core, switch core, etc). Second, we want to compile the ops down into LLVM bytecode for use with the JIT at runtime. By comparison:

Interpreted: Good startup time, reasonable execution time
JIT'd: Slower startup time, better execution time. Many tradeoffs available between the two (usually in terms of adding/removing optimization stages)

For long running programs, costs necessary to startup the JIT engine can be amortized over the entire execution time, which produces benefits overall.
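As a refresher for anybody who hasn't poked around in src/ops/, an op definition in that C-like ops language looks roughly like this (simplified from memory, so take the exact syntax with a grain of salt):

op add(out INT, in INT, in INT) {
    $1 = $2 + $3;
}

Today the build preprocesses definitions like that into C for the interpreted cores; the idea being kicked around here is to also run them through LLVM at build time so each op has a bytecode version sitting around for the JIT to stitch together.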

But, I'm digressing. The goal of Lorito was to produce a language that would be trivially easy to JIT at runtime. I was assuming that what we needed was a very small language to perform the job. However, what's really the most trivial to JIT is LLVM's native bytecode format. If we generate that bytecode at compile time, the runtime costs will be kept to a minimum. This means that we can have Lorito be any language of any complexity: the only requirements are that we be able to compile it down to LLVM bytecode, hopefully in a process that isn't overly complex or fraught with error possibilities. So long as the conversion only happens once at build time, it doesn't really matter how complicated it is.

Any parser for anything that's more complicated than assembly language will take development effort. The tradeoff is reduced development effort when we start rewriting the ops in Lorito, and increased inclination to write more things in Lorito than just ops. For instance, rewriting PMCs and their VTABLEs in Lorito means that we can start exposing those to the JIT as well. The more of Parrot that we expose to the JIT, the more gains there are to be had from it.

Assuming Lorito is going to now be a relatively high-level structured language as plobsing suggests, the question now is what should it look like? Should it be C, or like C? Should it be NQP, or like NQP? NQNQP?

As a thought experiment, consider the case where Lorito is C. In that case, our ops and VTABLEs are already all written in Lorito, and we already have a Lorito compiler available (LLVM). In this case, there are only a handful of steps to do before Parrot has a working (if naive) JIT:
  1. Add a configure probe to detect LLVM
  2. During the build, compile down ops to LLVM Bytecode, in addition to everything else we already do with them
  3. Add a utility function to compile a stream of bytecode (such as an entire Sub) to machine code using the LLVM bytecode.
  4. Add a way (such as in the Sub invoke VTABLE) to cache methods that have already been compiled, and add a switching mechanism to invoke that instead of the bytecode
There are a few more necessary details, of course, but if we take this approach Parrot isn't too far away from a working, proof-of-concept JIT.

I'm sure this idea won't jibe well with chromatic's "welcome to the new millennium" general distaste for C. I'm not 100% sure about it all myself. However, it is interesting and worthy of some consideration. It certainly does seem like a path of least resistance.

MediaWiki Book Designer

The other day I pushed a new repository to github. It is my visual book designer tool for MediaWiki that I've been developing on and off for years now. I wrote a post a while ago on my now-defunct Wikibooks blog about it, and things have progressed significantly since then.

I started working on this graphical book interface years ago in response to complaints I had heard from new Wikibookians. Making new books was too hard, because it had to be done page-at-a-time and the book structure (TOC, chapters, pages, subpages, etc) needed to be created manually using a complex system of relative links. On top of that, Wikitext really loses what little charm it has when you start doing complex structural, organizational, or navigational things with it, and it was another hurdle that prospective new authors couldn't always seem to clear. In short, it was hard to set up a new wiki-based ebook at all, and much much harder to do it correctly. This is the problem that I wanted to address.

The book designer took a number of forms before I settled on a graphical, outline-based approach. I had been doing work on it and tinkering in my spare time for a while before I was approached by Professor Kidd at Old Dominion University. She was looking to integrate MediaWiki and wiki-based e-books into her classroom, but was looking for an easier way to build books. Actually she was looking for a large number of usability enhancements, but book building was a big piece of it.

One thing led to another, her team picked up a nice grant to pursue the use of wikis in the classroom, and I signed on to help with some of the necessary development. The book designer grew from a curious little toy that I had been playing with privately into a full-blown extension. Development isn't 100% complete, but I decided that now was a good time to open it up to the public and host it on github. So, that's what I did.

This development project is also going to yield a few other extensions and enhancements which I will try to host publicly in due time. I'll post updates about all that when things happen.

The book designer project "works", but it isn't very pretty in a lot of places. It consists of two major components: the PHP-based backend and the Javascript-based interface. They communicate together through a quick-and-dirty (and very very ugly) interchange format. The user builds the book outline using the provided javascript tools, and clicks the button to save it. The javascript builds up a textual description of the outline and posts it to the server where the PHP backend parses that and creates the necessary pages with a large amount of helpful boilerplate text. When the dust settles, a complete skeleton book exists that only needs content to be added by interested authors.

I don't know how much more work I am going to do on this tool for the duration of this project. I also don't know how much I will be working on it thereafter, especially considering some of the other things I have going on in the coming weeks. But, if other people are interested in this tool and want to get involved, I will do everything I can to support that.

Monday, November 9, 2009

Parrot Project Proliferation Part 2

It's part two of my series on Parrot-related projects. There are a handful of these cool new projects that I know about, but I would love to hear some ideas from readers too. We can't give too much free publicity to cool Parrot projects!

Kakapo

NQP is a thin subset of the Perl6 syntax that's used as part of PCT to help build compilers. It's a very nice little language that tends to be relatively close to the underlying PIR code, and it's a real big help when building compilers. Part of the charm of NQP is that it doesn't include a runtime library. It's a bare-bones language, and provides only the few features necessary to build compilers.

Sometimes people using NQP are interested in a runtime library anyway. Sometimes, people need more than what the bare NQP compiler provides. To this end, the Kakapo project, started by Austin Hastings, aims to provide a runtime of interesting functions that the NQP coder can use to perform common tasks. There are helpful objects, methods, and subroutines that can be used to interact with Parrot in a more natural way than having to descend into streams of inlined PIR.

Kakapo is a really interesting project, because it serves a few purposes:
  1. Offers to turn NQP from just "a part of PCT" into a full-fledged and fully-capable programming language for Parrot development.
  2. Provides some interesting and effort-reducing utilities that can be used by compiler designers to get off to a quicker start
  3. Provides those utilities in a way that can be used from PIR programs and programs written in other languages on Parrot
When you get a chance, you should check Kakapo out and give it a spin. It may turn out to make your Parrot development work much easier.

Blizkost

Blizkost, started by Jonathan Worthington, is a project that I never thought I would ever see: Perl5 on Parrot.

Let that sink in for a minute. Perl5 is so idiosyncratic that there only is, and can really only be, one implementation. The official Perl5 specification is whatever the official Perl5 executable happens to parse and execute. The solution to work around this single-implementation problem is to simply embed the Perl5 interpreter into Parrot, instead of trying to develop a new parser from the ground up.

This is what Blizkost does: It implements a handful of new PMC types that act as wrappers for the Perl5 interpreter object and the various Perl5 object types. These PMC types allow Parrot programs to call into Perl5 code, and receive returned results back. Current functionality is limited, but there are a handful of interested (and talented!) contributors. If you know a thing or two about Perl5 internals, I'm sure they could use some help.

Winxed

NotFound started a very interesting little project: a javascript-like language compiler named Winxed. The twist is that instead of using PCT like most other projects do, he hand-wrote the lexer, parser, and code generators in C++.

Technically the word "compiler" covers a wide range of utilities, but most casual coders would probably refer to it as a "translator" instead. Winxed takes the javascript-like language input and outputs plain-text PIR code which can be compiled and executed by Parrot.

Wednesday, November 4, 2009

Libraries And Extensions

Parrot is just a virtual machine, an engine that's intended to run other programs and facilitate inter-operation between them. It's a facilitator for other programs and compilers and technologies; a linchpin for a whole new software environment. Getting compilers to run on it and communicate with each other is one goal: Getting them to all work seamlessly with a large assortment of native libraries is another.

We don't just want Parrot to be a cool program, we want it to enable a whole ecosystem of cool languages, programs, and libraries. I write a program in Perl6, use your framework that you've written in Cardinal, include it in a PHP webpage, and everybody gets to use system libraries written in C. To do that, we need people to create cool compilers, write cool programs, and write wrappers around cool native libraries. Seriously, everything cool.

One big part of this puzzle is Plumage, Parrot's module ecosystem manager. If you haven't already, check it out. Once registered with Plumage, any Parrot user should be able to download any compiler or any library that targets Parrot with only a few simple commands. It's like CPAN for Perl5, except it has the potential to get bigger. Much bigger. With Plumage in place, we can really start building up the addons: PIR code libraries, native code library wrappers, compilers, applications, etc. The possibilities are endless, and they provide a huge number of openings for new Parrot developers to get started on.
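I haven't memorized the exact command names, so treat this as a hedged sketch of the intended workflow rather than documentation, but the idea is that installing something should be on the order of:

plumage list
plumage install blizkost

with Plumage handling the fetching, building, and testing behind the scenes.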

Library Wishlist

What native libraries do you use? What libraries do you wish Parrot had easy wrappers for? The first thing we need is a list of libraries that people want to use, and possibly some use-cases for them (if they are obscure or rare enough that our developers won't be familiar with them). Post suggestions here in the comments of this blog, or on the Wiki, or in Trac, or wherever.

Darbelo put together the DecNum library wrapper for his GSoC project. I've been (with lots of help!) trying to put together wrappers for the BLAS and LAPACK linear algebra libraries. Eventually we will be able to easily install these things through Plumage, and then other projects will be able to use them as dependencies. These are just two examples, and both are math-centric, but they are certainly not the only libraries that need some attention. So ideas and suggestions for new libraries to target would be a great help.

Library Wrappers

Are you familiar with a popular library? Better yet, are you familiar with its APIs and internals? We could use your help writing a wrapper around that library for Parrot. The benefit, of course, is that once you write a wrapper for the library, it will be usable from programs written in all languages on Parrot. People writing in Ruby, Perl6, Tcl, and even eventually Perl5 will be immediately able to use your work.

Writing a library wrapper is relatively easy and straightforward, although there isn't a lot of good documentation about the process. Actually, that would be another cool thing for newbies to do: Take a look at the documentation we do have and help point out areas that are confusing or short on details. Tell us what's hard to understand and we can help improve things so other new users will be able to get started faster.
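To give a flavor of what "wrapping" actually means at the lowest level: Parrot's NCI lets PIR load a shared library and bind individual C functions by signature string, and a wrapper library is mostly a pile of friendlier PIR (or NQP) code built on top of calls like these. A minimal sketch using libm's cos (library name resolution can vary by platform, so consider this illustrative):

.sub 'main' :main
    .local pmc libm, cos_fn
    libm = loadlib 'libm'              # load the shared math library
    cos_fn = dlfunc libm, 'cos', 'dd'  # double cos(double): return 'd', param 'd'
    $N0 = cos_fn(0.0)
    say $N0                            # should print 1
.end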

Pure-Parrot Module

Austin has been working on a new project called Kakapo that's a library of runtime helper functions specifically for NQP. This is an example of another class of extension: a pure-Parrot library module. Pure-Parrot modules are ones that are written in PIR, or another language that runs on top of Parrot. These modules have the benefit that they don't need any compiled binaries besides Parrot itself, which is great for platform interoperability. Write once, use anywhere. This is the power of Parrot.

Parrot ships with a small set of libraries, although these are mostly utilities that Parrot itself needs for building or testing. There is plenty of room open for creating new libraries as well to do all sorts of things. Need some ideas? Look at CPAN. There are thousands of libraries there which could be rewritten in PIR or NQP, which would immediately open them up for use with other languages as well. Writing cool libraries like this is a great way to get started using Parrot, and a great way to contribute back to the rest of the community.
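As a tiny illustration of what a pure-Parrot module amounts to: some PIR in a namespace, compiled to bytecode, and loaded from anywhere with load_bytecode. The file names and the sub are made up for the example:

# greet.pir, compiled with: parrot -o greet.pbc greet.pir
.namespace ['Greet']
.sub 'hello'
    .param string who
    print "Hello, "
    say who
.end

# any other program running on Parrot can then do:
.sub 'main' :main
    load_bytecode 'greet.pbc'
    .local pmc hello
    hello = get_hll_global ['Greet'], 'hello'
    hello("Parrot")
.end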

Conclusion

The Parrot ecosystem is growing at an incredible rate. New language compilers, libraries, projects, and applications are springing up all over the place that use the tools Parrot provides. It's a compelling platform, and is demonstrably useful for a number of different types of projects. If you're interested in getting started using Parrot, non-core projects like this are a great way to get acclimated.

If you start a cool new project, or if you join a preexisting one, please let me know so I can highlight it on my blog.