Blog Closed

This blog has moved to Github. This page will not be updated and is not open for comments. Please go to the new site for updated content.

Wednesday, July 29, 2009

AIO: All the rage

Excitement about AIO has been steadily building recently, and a number of contributors have been talking about it and asking me questions about it. I personally have been blocking on the io_cleanups branch which I've been working on with Infinoid. However, both he and I have been pretty busy lately with work and life, and haven't been able to lock that branch down as I would have liked.

The io_cleanups branch is a little bit poorly named. I had originally intended it to be a short, fast, cleanup branch to get a few things in line before AIO. But we took the opportunity to try and add in a proper implementation of pipes too. I would rather have all the pieces present (Files, Pipes, Sockets) when we start AIO so I can remember to make things general enough, than have fewer things implemented and a slightly cleaner subsystem.

Astute observers will notice that we've already had pipes, in a fashion. Previously we've been able to open pipes to child processes using the FileHandle PMC. However, this mechanism was never very good and was only unidirectional. So Infinoid went through and created a proper bi-directional pipes implementation, which we're working on and testing right now.

The last problem that we're wrestling with is that this new "proper" implementation is a little bit messy. Parrot's IO system was basically written with only FileHandles in mind, and Pipes are a little bit shoehorned in right now. So ironically the "io_cleanups" branch actually introduced a new feature that is a little messy and didn't actually perform many cleanups. As much as I would like to resolve this little irony, I would far prefer to get a working Pipes implementation merged into trunk now. The branch is already too long-lived as it stands, and I really want to just reach the nearest stable point and merge it into trunk so we can start planning out the next moves.

Japhb was talking about adding D-Bus support to Parrot by wrapping the D-Bus libraries in PIR. Unfortunately, this is going to require better support in Parrot for asynchronous events and asynchronous IO operations. So, this is a good motivator to get AIO working, and is also a good use-case to help drive AIO development.

So as soon as the pipes thing stabilizes in the branch, I want to lock it down and merge it in. Then, I want to get started on AIO shortly thereafter. There is enough interest and even demand building at this point that I don't want to delay any more than necessary.

Friday, July 24, 2009

Why IIS Has Poor Market Share

Well, this probably isn't the only reason why IIS has such poor market share, but it is certainly the reason why I won't be using IIS for my work.

The Setup

My computer at work was recently reformatted using a "Windows XP Pro SP2" installation CD. After the install, I updated (on recommendation from Microsoft's Windows Updater) to SP3. I did this immediately because of security concerns.

When I tried to install IIS to do some testing of websites and services that I've been developing, I was told that I needed to insert my "Windows XP Pro SP3" CD instead. I don't have an SP3 CD, just an SP2 one. Too bad, can't use that!

The Twist

So I do a Google search for "IIS", and then do a search for "Download IIS". Most of the results that pop up are forum and mailing list posts from people having the same kind of problem that I am having. So I went to iis.net, a website that, I'm positive, must have a download link somewhere. So I click "Downloads", and can't find it. I click the link "Try IIS7", but that isn't it either. I search through the list of downloads for about 30 minutes before giving up in despair. If there is a download link to get IIS from this website, I can't find it. Well, I could find a trial download of Windows Server 2008 that came with IIS, but I'm not reformatting my computer again to install a trial version of the OS just so I can have the privilege of using IIS (and then being nagged forever about how I'm using a trial version and I need to purchase a legitimate license). There are some forms of obnoxiousness that I absolutely refuse to tolerate coming from my computer.

So I click on the "Chat live with a specialist" link, which insists that it's going to be a "live" person. A window opens up that says "Liveperson", and shows a picture of a live human female. I spend the next 10 minutes chatting with a bot. Actually, it could have been a person who was so heavily scripted and ignorant of their work that they failed the Turing test completely. Here's an actual quote to give a sense of what I mean. Keep in mind that we are on Microsoft's iis.net website, where literally every single page has a reference to IIS on it: "I see. To make sure we are on the same page, can you tell me what IIS is an acronym for?". "She" tells me that she can't be of any help (probably because "she" is just a bot) and gives me a phone number to call to get in touch with the sales manager. No thanks, I don't want to buy anything, especially when it's this much of a hassle.

The Punchline

So I'm not going to use IIS on my computer. I've given up on it. It's obviously too much of a hassle, and the benefit is only that I'll be able to develop and properly test IIS-based websites locally. When we start designing our next generation product line and developing our next generation platform, I'm going to suggest to my boss that we use a web server that we will be able to get easily and use for proper testing: Apache.

Monday, July 20, 2009

Parrot on Win64

For the past few weeks I've been idly trying to get Parrot building and running on 64-bit Windows. With the release quickly approaching and Christoph laying down a feature freeze, I've been doubling my efforts to get something working on this system. Astute observers will notice that there have been smolder reports coming from "MSWin32" on "amd64" processors, but this is an illusion. Parrot does indeed build and run on a 64-bit Windows platform, if you compile it into a 32-bit binary with a 32-bit compiler. However, this isn't exactly the same thing as having a 64-bit bird. So all the previous smolder reports we have from "Win64" or "Windows - amd64" are reports of 32-bit birds on those platforms. After a lot of effort I've managed to finally upload an actual smolder report of a 64-bit bird on Win64, and it's not pretty.

64-Bit Compiler

The first big problem with Win64 is that there aren't too many compilers there. MinGW, a port of GCC to Windows (and the compiler that ships with Strawberry Perl), doesn't have a 64-bit version on Windows available yet. At least, not a stable one. Consequently Strawberry Perl doesn't have a 64-bit distribution. I don't know if ActiveState offers a 64-bit Perl, but that still wouldn't have a compiler associated with it.

Microsoft offers its 64-bit compiler for free on that platform, but it's hard to find any free alternatives to that. However, Microsoft doesn't offer a 64-bit version of Visual Studio yet (that I am aware of), so we're doing this all on the command line with the Microsoft C compiler cl.

Compiling with 64-bit cl.exe

I've found this incantation seems to get us through the compilation process and allows us to run the test suite without too much hackery:

perl Configure.pl --ccflags="-GS- -MD" --intval="long long" --opcode="long long"

At least, this incantation should work after r40176.

Win64 Problems

I mentioned earlier that one of the problems on Win64 is finding a compiler. There are some that cost $$$, but I'm poor and refuse to buy a compiler for this work. If somebody has access to a proprietary one that might perform better I would love to hear about it. However, the compiler isn't the only problem.

For backwards compatibility, the Microsoft C compiler on Win64 treats "int" and "long" values as being both 32-bit quantities. If you want access to a 64-bit integer type, you need to use "long long". While it shouldn't, Parrot currently expects the size of INTVAL to be the same as the size of a pointer. It's bad practice, but that's the way things are right now. So, that's why you see the bits --intval="long long" and --opcode="long long" in the Configure.pl command line up there: To make sure these things are both 64-bit quantities.

[Update 22 July 2009: particle pointed me to this article, which explains the 64-bit landscape better than I do. For reference, Win64 with MSVC is "LLP64", while GCC on its various supported platforms (including, I hope, MinGW when a 64-bit port of that is finally released) is LP64 instead.]
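To make the LLP64 versus LP64 distinction concrete, here is a minimal sketch that classifies the current compiler by the sizes of long and void *. The function name is made up for illustration and is not part of Parrot; the classification only looks at these two sizes and assumes 8-bit bytes.

```c
/* Hypothetical helper (not part of Parrot): names the data model of the
 * current compiler, judged only by the sizes of long and void *. */
const char *data_model_name(void)
{
    if (sizeof(void *) == 8 && sizeof(long) == 8)
        return "LP64";   /* long and pointers are both 64 bits wide */
    if (sizeof(void *) == 8 && sizeof(long) == 4)
        return "LLP64";  /* only long long and pointers are 64 bits wide */
    if (sizeof(void *) == 4 && sizeof(long) == 4)
        return "ILP32";  /* the usual 32-bit arrangement */
    return "other";
}
```

On an LLP64 build, an INTVAL declared as long is narrower than a pointer, which is why the Configure.pl line above forces "long long".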

Another problem is a major lack of important definitions of things that support 64-bit programming. For instance, MSVC doesn't seem to define the values LLONG_MAX and LLONG_MIN, even though the C99 standard specifies them. Instead, MSVC provides the idiosyncratic _I64_MAX and _I64_MIN macros. MSVC also doesn't provide a strtoll function, for converting an ASCII string into a long long int. It does have strtol, but since long is only 32 bits, this function only returns a 32-bit quantity. This is a huge problem inside IMCC, which uses strtol, because suddenly a 64-bit Parrot on a 64-bit platform can't handle 64-bit integer constants in PIR code. Bummer.
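One way to paper over these gaps is a small compatibility shim. The sketch below is an assumption about how that could look, not what IMCC does: the names int64_value, INT64_VALUE_MAX, INT64_VALUE_MIN, and parse_int64 are invented here. On MSVC it leans on _strtoi64, the 64-bit counterpart of strtol in Microsoft's C runtime; elsewhere it uses the C99 strtoll.

```c
#include <errno.h>
#include <limits.h>
#include <stdlib.h>

/* Hypothetical names, not Parrot's: a 64-bit integer type and limits
 * that are available under both MSVC and C99 compilers. */
#ifdef _MSC_VER
typedef __int64 int64_value;
#  define INT64_VALUE_MAX _I64_MAX
#  define INT64_VALUE_MIN _I64_MIN
#  define parse_int64_raw _strtoi64
#else
typedef long long int64_value;
#  define INT64_VALUE_MAX LLONG_MAX
#  define INT64_VALUE_MIN LLONG_MIN
#  define parse_int64_raw strtoll
#endif

/* Parses the whole string as a 64-bit integer in the given base.
 * Returns 1 and stores the value in *out on success. Returns 0 if the
 * text is empty, has trailing characters, or does not fit in 64 bits. */
int parse_int64(const char *text, int base, int64_value *out)
{
    char *end = NULL;
    int64_value value;

    errno = 0;
    value = parse_int64_raw(text, &end, base);
    if (end == text || *end != '\0')
        return 0;            /* nothing parsed, or junk after the number */
    if (errno == ERANGE)
        return 0;            /* value overflowed the 64-bit range */
    *out = value;
    return 1;
}
```

A caller that used to write strtol(text, NULL, 10) would call parse_int64(text, 10, &value) instead and check the return code, so a constant like 4294967296 is no longer silently clamped to 32 bits.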

Parrot Has a Problem

Microsoft isn't the only guilty party in this situation. One major problem in this process is actually Parrot's fault: Parrot assumes throughout the codebase that INTVALs and pointers are the same size, and it casts data from one to the other without considering possible truncation. It's not hugely prevalent, but it does happen in several systems and in ways that are not going to be easy to resolve. One place where it happens very frequently is in the get_pointer and set_pointer VTABLEs, which we can resolve by preventing these two from ever being used in a way that pointer values are accessible to PIR code.
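The truncation hazard is easy to demonstrate in isolation. This is an illustrative sketch, not code from Parrot: both function names are invented, and it assumes a compiler that ships <stdint.h> with uintptr_t and intptr_t. The first function mimics what a cast through a 32-bit integer does to a pointer; the second shows the round trip through an integer type that is wide enough by definition.

```c
#include <stdint.h>

/* Round-trips a pointer through a 32-bit integer, the way a cast to a
 * 32-bit INTVAL would, and reports whether the address survived. */
int survives_32bit_cast(const void *ptr)
{
    uint32_t narrow = (uint32_t)(uintptr_t)ptr;  /* upper bits are dropped here */
    return (uintptr_t)narrow == (uintptr_t)ptr;
}

/* Same round trip through intptr_t, which is defined to be wide enough
 * to hold any object pointer, so the comparison always succeeds. */
int survives_intptr_cast(const void *ptr)
{
    intptr_t wide = (intptr_t)ptr;
    return (const void *)wide == ptr;
}
```

On a 64-bit build the first check fails for any address above 4 GB, and that is the kind of silent data loss those casts risk.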

A lot of code cleanup is going to have to happen in Parrot, some of it painful, in order to get Parrot properly building on more of these "exotic" systems like Win64. Luckily this is a problem that can be broken down into small chunks, and might be a perfect job for new Parrot hackers to play with. However, this problem will definitely not be resolved by the release tomorrow, and may not be resolved for several releases to come.

Conclusion

So that's the state of Parrot on Win64. The release tomorrow is going to be shipping with the very first Win64 entry in the PLATFORMS file, although a quick glance at it will show that it's not nearly the success story that other platforms are. 64-bit Ubuntu is an example of a platform that has great support, although that's because of the large number of Parrot developers that use it. Like other amd64 platforms, Win64 doesn't have JIT either, so that's not a mark against it. I would like to see support for Parrot on Win64 improved, but I'm not hopeful that it's going to happen any time soon.

Parrot4Newbies: Reporting Problems

In my past few Parrot4Newbies posts, I mentioned several times that new users should do tests and report problems. However, one helpful commenter pointed out to me that I didn't mention precisely how to make those reports. Today I'm going to talk about precisely that.

There are a few good ways to get in touch with the Parrot developers when you have a problem, or an issue, or a suggestion, or even random comments. There is the mailing list, the IRC chatroom, and the Issue Tracker. Each of these are useful for various things, and people from one may direct you to another if your message would be better there.

The Parrot Mailing List

The Parrot mailing list, parrot-dev AT parrot.org, is a great way to get in touch with the Parrot community as a whole with important issues. However, as with any mailing list, there will be a delay in receiving a response. I don't remember all the rules, but you might need to subscribe to the list before you can post to it. You will definitely want to subscribe if you want to read any responses.

When sending an email to the list make sure to use a descriptive subject line, like "test failure on amd64 in packfile.t" instead of something general like "test failure". Describe whatever issue you are seeing and definitely include any log files that help give more information.

Mailing lists aren't always reliable, so if you send a message you may not receive a reply in a reasonable amount of time. This is just a fact of the medium and shouldn't be taken as any kind of insult or offense. If you're looking for more immediate responses, use the IRC chatroom. If you don't need an immediate response, but don't want your issue to be lost in the sands of the mailing list archive, use the issue tracker instead.

#parrot IRC Chatroom


The Parrot team maintains an IRC chatroom #parrot on the irc://irc.parrot.org network. If you've never used IRC before and don't know what this means, I'm sorry but it's beyond the scope of this post to teach you. However, if you want to send me a quick comment or a personal message through some other venue, I would be happy to teach you how to access it.

The IRC chatroom is a great place to go to get immediate responses, but only a small subset of the Parrot team will be online at any given time. At some times of day there are very few developers on at all. If you find a problem that is very small, report it on the IRC chatroom and you may be lucky enough to get an immediate fix for it. If the people online at the time cannot help you immediately they will probably direct you to the issue tracker or to the mailing list in order to get a larger number of eyes looking at your report.

On IRC you can use the nopaste service to post logfiles and backtraces and programs and any other information you have. Just make sure you specify the IRC channel as "#parrot" so people in the chatroom can see it.

IRC is a great medium for immediate communications, and allows a higher bandwidth than either the issue tracker or the mailing list. However, there is a definite lack of persistence and if your issue isn't answered immediately you can't expect people will backlog and answer your question later. There are logs for the IRC channel, but most conversations are too long, rambling, and interleaved for people to be able to follow after the fact without some specific direction. Again, it's just a fact of the medium. If you don't need an immediate response (or don't expect anybody to have one ready), it may be better to post your issue to the mailing list or the issue tracker instead. If you don't know where to post an issue, ask on IRC and somebody will tell you where to go.

IRC is probably the best place to go if you need help or have questions, because people there can usually answer your questions quickly or point you to the proper documentation. The mailing list may be better for more complicated questions, or questions that the people on IRC cannot answer.

Parrot Issue Tracker

If you have a bug or a report, the best place to go is probably the issue tracker. Creating a bug report on Trac automatically generates an email to the Parrot mailing list too, so everybody will see it at least once. Plus, all open bugs will stay visible in the tracker until they are resolved.

This is a pretty good system, although there are some logistical issues that can prevent older tickets from getting the attention they need. There are also problems where duplicate issues can take up multiple active tickets, and information isn't always updated in tickets in a timely manner.

If you have an issue and really need it dealt with quickly, make sure to set the priorities appropriately and try to assign the ticket to a specific developer. Choosing the right developer to assign a ticket to can be tricky, so it's probably best to ask on IRC or the mailing list first to find out who is interested in the issue and is capable of fixing it. Keep in mind that developers are very busy people and all have long lists of other tasks that they are working on, so you can't expect overnight results.

To create and be able to edit tickets you will probably want to create an account on Trac. Creating an account is free, quick, and easy, and doesn't require too much information from you.

Conclusion

So there are the three ways to get in touch with the Parrot development team when you need help or have an issue to report. If people need any more details than I have provided here, please ask and I will be happy to point you in the right direction.

Sunday, July 19, 2009

Rethinking Parrot Execution

I've been doing some thinking recently about Parrot's execution strategy. One of the reasons for this train of thought is because of some of the issues we've been seeing that relate to the inferior runloops problem.

The inferior runloops problem, so far as I understand it, is this: Parrot can create multiple runloops on the C system stack which execute code independently and can interfere with each other in strange and complicated ways. There are two main methods that we could use to go about resolving this: The first is that we painstakingly go around and properly encapsulate everything and implement all sorts of runtime checks to make sure various special cases don't happen and cause data corruption. The second is to prevent Parrot from recursing into multiple runloops at all (or, if necessary, do it very rarely).

The second case seems to be the obvious solution. "It causes problems? Then don't do it!". However, it's not so simple as that. The problem comes in many forms but specifically let's talk about PMC VTABLEs. PMCs can be overridden from PIR, so the VTABLE you are calling may turn out to be a PIR sub called inline from C without being able to return to the runloop first. So, there's no choice but to recurse into a new runloop. We could set up some elaborate system with longjmps to allow this kind of situation, but that would create new problems that we would rather avoid.

Of course this is only one case of a problem. There are plenty of situations where we can meaningfully avoid runloop recursion despite some cases where we cannot avoid them. One such place is in runtime code compilation.

Control flow in Parrot currently goes like this: We execute Parrot from the command line with the name of a PIR file to execute. Parrot passes the PIR file to IMCC which compiles it into bytecode. IMCC immediately executes any :init, :immediate, and :postcomp functions, then executes the :main function. The big problem with this is that IMCC is managing control flow for Parrot. When we do a runtime code compilation the same issue occurs. The executing PIR code creates the PIR compreg object and invokes it with a string of PIR code. IMCC takes that string, compiles it, and executes any :init and :immediate Subs in recursive runloops before returning a reference to the :main Sub to be executed in the parent runloop. This system provides needless complication and plenty of opportunities for spectacular failure. Let's look at a different way to do this.

We call Parrot with the name of a PIR file. Parrot passes the PIR file to IMCC to be compiled. IMCC compiles the PIR file and returns some kind of object. Parrot then passes that object into the concurrency scheduler that manages control flow from that point forward. That "some kind of object" couldn't just be a Sub PMC because it would need to contain arrays of all the :init and :immediate and :load Subs, a pointer to the :main Sub, etc. The scheduler will then fire off the subs from this execution object in the correct order, and do them all in the same runloop.

When we do a runtime compilation of PIR code using the PIR compreg object, the same mechanism would apply. IMCC compiles the code and returns the execution object. We pass that object to the scheduler which stitches the :init and :immediate subs into the currently executing program flow and just executes them in the same runloop. A whole class of RT and Trac tickets will disappear overnight, the interface with IMCC gets better encapsulated, IMCC's internal logic becomes majorly simplified, we better integrate the scheduler with primary control flow, we get better control over where and when runloops are created, and we can reduce (though not eliminate) the need to recurse into new runloops in some situations.
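To make the shape of this proposal easier to picture, here is a toy sketch in C. Everything in it is hypothetical: Parrot has no exec_unit or scheduler types like these, a "sub" is reduced to a plain C function, and the scheduler is a fixed-size FIFO, so newly submitted subs run after whatever is already queued rather than being spliced in at the current position. The point is only that the compiler returns a description of what to run, and a single flat loop does the running.

```c
#include <stddef.h>
#include <string.h>

/* A "sub" is reduced to a C function that receives a caller context. */
typedef void (*sub_fn)(void *context);

/* What the compiler would hand back instead of running code itself. */
typedef struct exec_unit {
    const sub_fn *init_subs;   /* :init and :immediate subs, in order */
    size_t        init_count;
    sub_fn        main_sub;    /* may be NULL for a library-style unit */
} exec_unit;

/* Fixed-size FIFO of pending subs, standing in for the scheduler. */
#define QUEUE_CAPACITY 64
typedef struct scheduler {
    sub_fn pending[QUEUE_CAPACITY];
    size_t head;
    size_t tail;
} scheduler;

/* Queues the unit's subs in order; returns 0 if the queue lacks room. */
int scheduler_submit(scheduler *sched, const exec_unit *unit)
{
    size_t needed = unit->init_count + (unit->main_sub ? 1 : 0);
    size_t i;

    if (QUEUE_CAPACITY - sched->tail < needed)
        return 0;
    for (i = 0; i < unit->init_count; i++)
        sched->pending[sched->tail++] = unit->init_subs[i];
    if (unit->main_sub)
        sched->pending[sched->tail++] = unit->main_sub;
    return 1;
}

/* The single flat loop. Anything submitted while it is running is picked
 * up by this same loop, not by a nested one. Returns the number of subs
 * executed. */
size_t scheduler_run(scheduler *sched, void *context)
{
    size_t executed = 0;
    while (sched->head < sched->tail) {
        sched->pending[sched->head++](context);
        executed++;
    }
    return executed;
}

/* Demo subs: each appends one letter to a trace string passed as the
 * context. The caller must provide a large enough buffer. */
static void append_letter(void *context, char letter)
{
    char *trace = (char *)context;
    size_t used = strlen(trace);
    trace[used] = letter;
    trace[used + 1] = '\0';
}
void demo_init(void *context) { append_letter(context, 'i'); }
void demo_main(void *context) { append_letter(context, 'm'); }
```

Submitting a unit with two demo_init subs and demo_main, then calling scheduler_run with an empty trace buffer, runs all three in order inside one loop.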

Besides those problems that get fixed, we also open ourselves up to some cool new possibilities, such as being able to specify multiple threads that get launched automatically at runtime, or the ability for the scheduler to serialize the current control flow state and save it to a file for continuing later. I'm sure there are more things as well.

Sounds like a win-win-win-win-win-win-win to me.

Now obviously a lot of thought would need to go into this idea, and I'd love to hear any feedback that people have about it.

Saturday, July 18, 2009

Parrot 1.4: Feature Freeze

Christoph, release manager for 1.4, put out a note to the mailing list today about the release. Starting today there is a freeze on new features and he's asking for people to run lots of tests on multiple platforms.

The file in our repository, PLATFORMS, contains a list of the various OS/Compiler combinations where Parrot is known to build and perform. There are a small number of target platforms where we want Parrot to build primarily, and a larger number of other platforms where Parrot also builds. I'm sure this second list is incomplete (and can definitely be made larger with a few fortuitous patches). Getting Parrot to build and run on new platforms is a really big deal.

The 1.4 release is one of the special "deprecation point" releases. Developers of extensions or compilers on top of Parrot can be assured that major changes to the API will not happen before the next deprecation point. When we decide that something needs to be changed, we put a notice into DEPRECATED.pod and wait until after the next deprecation point so we can make the change. In other words, people working with Parrot can be assured that all releases between deprecation points will have a stable and well-defined API, and that they will have adequate forewarning about things that are going to change and when.

There are a lot of things in Parrot right now that have been deprecated and are slated to be removed or changed radically after 1.4. The remaining vestiges of the old stacks system are scheduled to be removed along with unused (and unmaintained) GC cores, get_addr and set_addr opcodes (which are a huge source of segfaults) and the old polymorphic inline cache system. This is just a small number of changes that are scheduled to be happening that I am particularly interested in.

Anyway, because 1.4 is a special release and any features we ship with it will be stuck there until the next deprecation point (2.0), it's super important that we get plenty of testing done now. We want 1.4 to be stable and usable, because users are going to be relying on its functionality for the next few months. I cannot stress enough how important it is to do some testing this weekend, on as many systems as possible.

So fire up your console and make fulltest on a platform or two. Submit any bugs you find, and upload some reports to smolder too. Every little bit helps!

Friday, July 17, 2009

Parrot4Newbies: Platforms

The problem with Parrot building on so many platforms, or attempting to build on so many, is that the different platforms have different APIs and different capabilities that Parrot needs to be aware of. Several systems, especially IO and IPC, need platform-specific implementations, along with associated multi-platform testing. Simply put, we need lots of people to be looking at Parrot on lots of platforms. Here are some jobs that a newcomer to Parrot can get to work on, especially if you have access to a rare system.

Set Up a Smoke Tester


I mentioned this in a previous post. Even though I don't want to repeat myself generally, setting up an automatic smoke tester is really a great and easy way to get involved in Parrot and to be a huge immediate help for the community. Since my last post went out we've seen a few new IRC bots showing up to inform us automatically about build and test failures, and we've seen a few new platforms showing up in the smoke test results. I can't overstate how valuable this is, seriously!

Interprocess Communication

Know anything about IPC? More importantly, know anything about IPC on your rare system? Parrot has basic functionality built in to spawn new processes, and our first real pipes implementation should be landing in trunk this week. However, a lot could benefit from people with solid know-how getting to work on some of the nitty-gritty issues. Parrot has several tickets dealing with IPC, and some of them are among the oldest tickets in RT.

Ticket #31144 has to do with the arguments to our spawn and exec functions, which need to be cleaned up and parsed in a platform-independent way.

Ticket #36619 likewise involves the return values from these two operations, which should be returned as platform-independent values so Parrot can make use of them at a higher level.
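As a rough illustration of what "platform-independent return values" could mean, the sketch below normalizes the raw status that the C library's system() hands back. It is an assumption about one possible approach, not what the ticket prescribes, and normalized_exit_code is a made-up name. On POSIX systems the raw status has to be decoded with the macros from <sys/wait.h>; on Windows, system() already returns the command interpreter's exit code.

```c
#include <stdlib.h>
#ifndef _WIN32
#  include <sys/wait.h>
#endif

/* Hypothetical helper: converts the raw value returned by system() into
 * the child's exit code. Returns -1 if the child could not be started
 * or (on POSIX) did not exit normally, for example because a signal
 * killed it. */
int normalized_exit_code(int raw_status)
{
    if (raw_status == -1)
        return -1;
#ifdef _WIN32
    return raw_status;                   /* already the exit code */
#else
    if (WIFEXITED(raw_status))
        return WEXITSTATUS(raw_status);  /* decode the packed status */
    return -1;
#endif
}
```

Higher-level code can then compare normalized_exit_code(system(command)) against plain exit codes without caring which platform produced the status.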

Files and Filesystems

Parrot has a special OS PMC, which is a singleton type that allows some interaction with the operating system. The most important among these operations are file interaction operations. Unix people may know a utility called "touch", and Perl 5 people may be familiar with the utime builtin. Parrot doesn't really have a way to update file times like this, and if you're able, we would love for you to add one. Ticket #38145 discusses this particular need.
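For anybody wondering what the C side of such a feature might involve, here is a minimal sketch of a touch-like function. It is not taken from the ticket: touch_file and touch_impl are invented names, the POSIX branch uses utime() and the Windows branch uses the C runtime's _utime(). How it would be exposed through the OS PMC is left open.

```c
#include <stddef.h>
#include <stdio.h>
#ifdef _WIN32
#  include <sys/utime.h>
#  define touch_impl(path) _utime((path), NULL)
#else
#  include <utime.h>
#  define touch_impl(path) utime((path), NULL)
#endif

/* Hypothetical helper: creates the file if it does not exist, then sets
 * its access and modification times to the current time.
 * Returns 0 on success and -1 on failure. */
int touch_file(const char *path)
{
    /* Append mode creates a missing file but never truncates an
     * existing one. */
    FILE *handle = fopen(path, "ab");
    if (handle == NULL)
        return -1;
    fclose(handle);

    /* A NULL times argument means "use the current time". */
    return touch_impl(path);
}
```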

Likewise, Ticket #38146 discusses the creation of a file copying utility, although discussion there has since degraded into a general discussion about what's the best architecture to use for implementing these kinds of functions. Input on that discussion, or a solution to the problem, would both be appreciated (and moving the ticket from RT to Trac where it can be even more visible would be a big plus!)

Are you a Win32 person? There are lots of tickets around that deal with some of the idiosyncrasies of that platform. Ticket #39853 specifically deals with the way forward slashes and backslashes are used in Windows, and how they should be standardized in file names.
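One simple form of standardization is to pick a single separator and rewrite paths to use it. The sketch below is just an illustration of that idea with a made-up function name; which separator Parrot should settle on, and where the conversion belongs, is exactly what the ticket is about.

```c
#include <stddef.h>

/* Hypothetical helper: rewrites every backslash in path to a forward
 * slash, in place. Returns the number of characters changed. */
size_t normalize_slashes(char *path)
{
    size_t changed = 0;
    for (; *path != '\0'; path++) {
        if (*path == '\\') {
            *path = '/';
            changed++;
        }
    }
    return changed;
}
```

Because the function edits in place, it must be given a writable buffer and not a string literal.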

Libraries and Dynamic Loading


Parrot aims to be very dynamic and pluggable, and a number of items from PMC types to opcodes to libraries can be loaded in to Parrot at runtime. But for all this to work, Parrot needs to have a robust mechanism for finding and loading libraries, including installed system libraries and local libraries.

Ticket #37258 involves problems loading libraries with periods in the filename. It's a practice that's far more common on Linux than on Windows, but important to get right in a platform-agnostic way.

Compilers

Any good with compilers? Parrot builds really well with GCC, but when you start talking about other compilers the situation isn't so rosy. We always need lots of help (including fixing errors/warnings and setting up automatic smoke testers) getting Parrot to build with other compilers. Here are some of the ones that we need more testing and work on, specifically:
  • Microsoft C Compiler, 64-bit systems
  • LLVM and Clang, 32- and 64-bit systems
  • Intel C Compiler, 32- and 64-bit versions
  • G++
Other compilers would be great too, but these are some of the big ones that we would love to get more help with. If you're good with your compiler, and it's on this list, we would love to hear from you!

Tuesday, July 14, 2009

Parrot4Newbies: Encapsulation

If you were anything like me, you probably rolled your eyes when your professors talked about all sorts of abstract concepts like "abstraction" and "encapsulation". At the time they were just words that didn't seem to have a lot of meaning; big, new vocabulary words that made the lectures more difficult to follow.

Well, now that I've worked on bigger projects than "Implement a merge sort algorithm on an array of 100 integer values in Java", bigger projects like Parrot, I have a better appreciation for what encapsulation is and why it's important.

There are many systems in Parrot which were written hastily and not always to the highest coding standards. I say that with full knowledge that often times concerns about performance and deadlines outweigh the need to produce beautiful and maintainable code. And in many cases a system is prototyped "quick and dirty" with the intent that it would be redone eventually, but immediately thereafter the coder moves on to different projects. This is a normal part of the development process (especially in the world of Perl!), but eventually is now.

Several of Parrot's subsystems are poorly encapsulated, if any attempt has been made to encapsulate them at all. For people who are decent with C, and are good at cleaning up code without having to make all sorts of functional changes, I have some jobs for you:

The Strings System

The strings subsystem is a big offender in terms of non-existent encapsulation. The "guts" of the STRING structure are poked at directly throughout the codebase, and many different locations throughout handle them differently. Parrot doesn't even really support read-only strings like it should right now, because there are too many places where strings are accessed to check for a read-only flag in all of them.

The string system needs to be encapsulated so we can make some much needed improvements to stability, performance, and capability in the future. Cleaning the API and abstracting the details behind a clean interface are the first steps in any future development efforts. The best part is that these fixes can be made incrementally, a perfect task for a new hacker.
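As a toy illustration of why this matters, consider the sketch below. It is not Parrot's STRING structure: demo_string and its functions are invented for this example. Once every write goes through one function, the read-only check lives in exactly one place and no longer has to be repeated at every call site.

```c
#include <stddef.h>
#include <string.h>

/* Invented for illustration: a string whose fields are meant to be
 * touched only by the functions below. */
typedef struct demo_string {
    char  *buffer;     /* caller-provided storage, always NUL-terminated */
    size_t length;     /* characters in use, excluding the terminator */
    size_t capacity;   /* total size of buffer in bytes, at least 1 */
    int    read_only;  /* non-zero once the string is frozen */
} demo_string;

/* Wraps caller-provided storage as an empty, writable string. */
void demo_string_init(demo_string *s, char *storage, size_t capacity)
{
    s->buffer = storage;
    s->buffer[0] = '\0';
    s->length = 0;
    s->capacity = capacity;
    s->read_only = 0;
}

size_t demo_string_length(const demo_string *s) { return s->length; }

/* Marks the string read-only; later writes are refused. */
void demo_string_freeze(demo_string *s) { s->read_only = 1; }

/* Appends text. Returns 1 on success, 0 if the string is read-only or
 * the storage is too small. This is the only place that needs to know
 * about the read-only flag. */
int demo_string_append(demo_string *s, const char *text)
{
    size_t extra = strlen(text);

    if (s->read_only)
        return 0;
    if (s->capacity - s->length <= extra)   /* keep room for the '\0' */
        return 0;
    memcpy(s->buffer + s->length, text, extra + 1);
    s->length += extra;
    return 1;
}
```

Code that pokes at buffer and length directly would bypass the check, which is the situation described above.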

Contexts

Contexts are very important things. They represent the current execution environment, containing the current register set, the set of scoped lexical variables, etc. Because of their importance as a central component in Parrot, and because the API is so woefully unencapsulated, any improvements to the Contexts subsystem are slow or even undoable.

And I've discussed on this very blog some of the big projects we have planned for Contexts.

Contexts are very central to Parrot, so working to clean and properly encapsulate them will take the intrepid coder on a whirlwind tour of the Parrot core internals: the calling conventions, JIT, runcores, lexical variables, exceptions, etc. It's a big job, and a great opportunity to explore the Parrot codebase and get your feet wet. It's also another task that can be done in small incremental steps.

The benefits however are many, and I'll be happy to talk at length about them to anybody who is interested.

GC String Allocator

I've got a dirty little secret to share: Parrot really has two separate GC cores. Or, two facets of a single core (the terminology doesn't really matter). The first, which people talk about the most, is the fixed-size header allocator. The second, which goes oft-unmentioned, is the string buffer allocator.

The GC system itself is pretty well encapsulated, but the individual internal components of it are not. Specifically, the string allocator relies on intimate internal knowledge of the fixed-size header allocator core, which means replacing one requires massive edits (if not complete replacement) to the other. Separating the two out to use a cleaner interface and enabling one to be updated without affecting the other would be a major boon for Parrot. This one task would be a major help in writing new GC cores, which in turn would mean a major performance boost for Parrot.

Conclusion

These aren't the only systems and subsystems in Parrot that need better encapsulation; almost all of the systems do. This is just a good representative set of such systems that are in the most dire need. Hackers new to the Parrot community should definitely try their hand at cleaning up interfaces to some of our systems, because a little bit of cleanup work can go a long way to helping Parrot grow.

Saturday, July 11, 2009

The Bugs

Here's an interesting backtrace that I've been looking at for a while:

#2  0x00007f3cbab0bd7c in Parrot_confess (
cond=0x7f3cbad4a0d8 "PObj_is_PMC_TEST(sig_pmc)",
file=0x7f3cbad4a040 "src/call/pcc.c", line=613) at src/exceptions.c:607
#3 0x00007f3cbab2541a in Parrot_init_arg_op (interp=0x946080, ctx=0x22f2040,
pc=0x7f3cbb1e2eb8, sti=0x7fffc31f5a80) at src/call/pcc.c:613
#4 0x00007f3cbab2a150 in set_retval_util (interp=0x946080,
sig=0x7f3cbad4a7fd "P", ctx=0x22f2040, st=0x7fffc31f5a80)
at src/call/pcc.c:1948
#5 0x00007f3cbab2a64e in set_retval (interp=0x946080, sig_ret=80,
ctx=0x22f2040) at src/call/pcc.c:1994
#6 0x00007f3cbab2f0e3 in Parrot_runops_fromc_args (interp=0x946080,
sub=0x7f3cb58636b0, sig=0x7f3cbad4d7a7 "P") at src/call/ops.c:340
#7 0x00007f3cbab7d093 in run_sub (interp=0x946080, sub_pmc=0x7f3cb58636b0)
at src/packfile.c:686
#8 0x00007f3cbab7d3e4 in do_1_sub_pragma (interp=0x946080,
sub_pmc=0x7f3cb58636b0, action=PBC_MAIN) at src/packfile.c:778
#9 0x00007f3cbab7d618 in do_sub_pragmas (interp=0x946080, self=0x268b370,
action=PBC_MAIN, eval_pmc=0x7f3cb5862780) at src/packfile.c:940
#10 0x00007f3cbab7d705 in PackFile_fixup_subs (interp=0x946080, what=PBC_MAIN,
eval=0x7f3cb5862780) at src/packfile.c:4912
#11 0x00007f3cbad3398a in imcc_compile (interp=0x946080,
s=0x3b6fab0 "\n.HLL \"perl6\"\n\n.namespace [\"Test\"]\n.sub \"_block1830\" :subid(\"264_1247316154\")\n.annotate \"line\", 0\n .const 'Sub' $P1833 = \"265_1247316154\" \n capture_lex $P1833\n.annotate 'file', 't/spec/S14-role"..., pasm_file=0, error_message=0x7fffc31f5f60) at compilers/imcc/parser_util.c:738
#12 0x00007f3cbad33a60 in imcc_compile_pir_ex (interp=0x946080,
s=0x3b6fab0 "\n.HLL \"perl6\"\n\n.namespace [\"Test\"]\n.sub \"_block1830\" :subid(\"264_1247316154\")\n.annotate \"line\", 0\n .const 'Sub' $P1833 = \"265_1247316154\" \n capture_lex $P1833\n.annotate 'file', 't/spec/S14-role"...)
at compilers/imcc/parser_util.c:876
#13 0x00007f3cbab51e78 in pcf_P_Jt (interp=0x946080, self=0xa32dd0)
at src/nci.c:237
#14 0x00007f3cbac4de98 in Parrot_NCI_invoke (interp=0x946080, pmc=0xa32dd0,
next=0x7f3cbb1e2eb0) at src/pmc/nci.c:244
#15 0x00007f3cbaa9627c in Parrot_invokecc_p (cur_opcode=0x7f3cbb1e2ea0,
interp=0x946080) at src/ops/core_ops.c:18185
#16 0x00007f3cbab882a8 in runops_slow_core (interp=0x946080, pc=0x7f3cbb1e2ea0)
at src/runcore/cores.c:462
#17 0x00007f3cbab86f7c in runops_int (interp=0x946080, offset=287157)
at src/runcore/main.c:987
#18 0x00007f3cbab2d8cc in runops (interp=0x946080, offs=287157)
at src/call/ops.c:119
#19 0x00007f3cbab2dce9 in runops_args (interp=0x946080, sub=0x7f3cb69c4de0,
obj=0x9c7770, meth_unused=0x0, sig=0x7f3cbad48ba3 "vP", ap=0x7fffc31f62f0)
at src/call/ops.c:269
#20 0x00007f3cbab2f0bc in Parrot_runops_fromc_args (interp=0x946080,
sub=0x7f3cb69c4de0, sig=0x7f3cbad48ba3 "vP") at src/call/ops.c:338
#21 0x00007f3cbab08e31 in Parrot_runcode (interp=0x946080, argc=2,
argv=0x7fffc31f65f0) at src/embed.c:1021
#22 0x00007f3cbad1cb2d in imcc_run_pbc (interp=0x946080, obj_file=0,
output_file=0x0, argc=2, argv=0x7fffc31f65f0) at compilers/imcc/main.c:801
#23 0x00007f3cbad1d791 in imcc_run (interp=0x946080,
sourcefile=0x7fffc31f76ba "perl6.pbc", argc=2, argv=0x7fffc31f65f0)
at compilers/imcc/main.c:1092
#24 0x0000000000400bc4 in main (argc=2, argv=0x7fffc31f65f0) at src/main.c:60

This is a backtrace from a test failure in Rakudo. Actually, the same backtrace occurs for two separate test failures, which makes it particularly interesting. The two are almost identical except for the values of most function arguments.

So what is going on here? We see the function Parrot_invokecc_p which we can recognize as the function that implements the invokecc opcode. This in turn calls Parrot_NCI_invoke, which is the name of the generated VTABLE "invoke" function for the NCI PMC type. This calls the NCI thunk pcf_P_Jt, which is a generated call frame for the "P_Jt" call signature. It's a function that takes a reference to the interpreter (J) and a C string (t) and returns a PMC (P). This thunk calls the function imcc_compile_pir_ex with a long string of PIR code. In short, a runtime eval of PIR code.
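For anybody who hasn't looked at one of these thunks before, here is a rough C sketch of the shape of a "P_Jt" thunk: it takes a generic function pointer, casts it back to the concrete signature (returns a PMC, takes an interpreter and a C string), and makes the call. The types and names (DemoInterp, DemoPMC, demo_compile, demo_thunk_P_Jt) are placeholders I made up for illustration; the real generated thunks in src/nci.c also have to unpack arguments from Parrot's calling conventions, which is left out here.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-ins for the interpreter and a PMC. */
typedef struct DemoInterp { int calls; } DemoInterp;
typedef struct DemoPMC { size_t source_len; } DemoPMC;

/* Generic function pointer, as an NCI object might store its target. */
typedef void (*demo_generic_func)(void);

/* Concrete type for the "P_Jt" signature: P = returns a PMC,
 * J = takes the interpreter, t = takes a C string. */
typedef DemoPMC *(*demo_func_P_Jt)(DemoInterp *, const char *);

/* Stand-in for a compile function in the role of imcc_compile_pir_ex. */
DemoPMC *demo_compile(DemoInterp *interp, const char *source)
{
    static DemoPMC result;          /* single static result, demo only */
    interp->calls++;
    result.source_len = strlen(source);
    return &result;
}

/* The thunk: recover the real signature, then call through it. */
DemoPMC *demo_thunk_P_Jt(DemoInterp *interp, demo_generic_func target,
                         const char *arg)
{
    assert(interp != NULL && target != NULL && arg != NULL);
    demo_func_P_Jt typed = (demo_func_P_Jt)target;
    return typed(interp, arg);
}
```

One thunk like this is needed per distinct signature string, which is why they are generated rather than written by hand.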

So I was banging my head against this issue for a while, when pmichaud was able to narrow this down to a pure-PIR snippet. Here's his code and backtrace:

$ cat x.pir
.sub main
$S0 = <<'END'
.sub 'abc' :load :init
say 'run abc'
.end

.sub 'def' :load :init
die 'died in def'
.end

.sub 'ghi' :load :init
say 'run ghi'
.end

.sub 'main' :main
say 'run main'
.end
END

$P0 = compreg 'PIR'

push_eh trap1
$P0($S0)
trap1:
pop_eh

debug 0

push_eh trap2
$P0($S0)
trap2:
pop_eh
.end

$ gdb ./parrot
(gdb) b Parrot_debug_ic
(gdb) run x.pir

Breakpoint 1, Parrot_debug_ic (cur_opcode=0x916b9a8, interp=0x90c2040) at src/ops/core.ops:1001
1001 if ($1 != 0) { Interp_debug_SET(interp, $1); }
(gdb) bt
#0 Parrot_debug_ic (cur_opcode=0x916b9a8, interp=0x90c2040) at src/ops/core.ops:1001
#1 0xb7dc3410 in runops_slow_core (interp=0x90c2040, pc=0x916b9a8) at src/runcore/cores.c:462
#2 0xb7dc200e in runops_int (interp=0x90c2040, offset=5) at src/runcore/main.c:987
#3 0xb7d9b9e5 in runops (interp=0x90c2040, offs=5) at src/call/ops.c:119
#4 0xb7d9be23 in runops_args (interp=0x90c2040, sub=0x9136388, obj=0x9124f70, meth_unused=0x0, sig=0xb7fe53b3 "P", ap=0xbfe3d52c "����(\031\002�\201�η")
at src/call/ops.c:269
#5 0xb7d9cce6 in Parrot_runops_fromc_args (interp=0x90c2040, sub=0x9136388, sig=0xb7fe53b3 "P") at src/call/ops.c:338
#6 0xb7db76d9 in run_sub (interp=0x90c2040, sub_pmc=0x9136388) at src/packfile.c:686
#7 0xb7db79cb in do_1_sub_pragma (interp=0x90c2040, sub_pmc=0x9136388, action=PBC_MAIN) at src/packfile.c:778
#8 0xb7db7bd7 in do_sub_pragmas (interp=0x90c2040, self=0x916a098, action=PBC_MAIN, eval_pmc=0x9136340) at src/packfile.c:940
#9 0xb7db7cc3 in PackFile_fixup_subs (interp=0x90c2040, what=PBC_MAIN, eval=0x9136340) at src/packfile.c:4912
#10 0xb7fcbc65 in imcc_compile (interp=0x90c2040,
s=0x916a2b8 ..., pasm_file=0, error_message=0xbfe3d6fc)
at compilers/imcc/parser_util.c:738
#11 0xb7fcbd4e in imcc_compile_pir_ex (interp=0x90c2040,
s=0x916a2b8 ...) at compilers/imcc/parser_util.c:876
#12 0x0915be25 in ?? ()
#13 0xb7ef328c in Parrot_NCI_invoke (interp=0x90c2040, pmc=0x91364d8, next=0x916b9a4) at ./src/pmc/nci.pmc:326
#14 0xb7d14f76 in Parrot_invokecc_p (cur_opcode=0x916b99c, interp=0x90c2040) at src/ops/core.ops:504
#15 0xb7dc3410 in runops_slow_core (interp=0x90c2040, pc=0x916b99c) at src/runcore/cores.c:462
#16 0xb7dc200e in runops_int (interp=0x90c2040, offset=0) at src/runcore/main.c:987
#17 0xb7d9b9e5 in runops (interp=0x90c2040, offs=0) at src/call/ops.c:119
#18 0xb7d9be23 in runops_args (interp=0x90c2040, sub=0x9136418, obj=0x9124f70, meth_unused=0x0, sig=0xb7fe0d17 "vP", ap=0xbfe3d97c "") at src/call/ops.c:269
#19 0xb7d9cce6 in Parrot_runops_fromc_args (interp=0x90c2040, sub=0x9136418, sig=0xb7fe0d17 "vP") at src/call/ops.c:338
#20 0xb7d79627 in Parrot_runcode (interp=0x90c2040, argc=1, argv=0xbfe3daf8) at src/embed.c:1021
#21 0xb7fb548f in imcc_run_pbc (interp=0x90c2040, obj_file=0, output_file=0x0, argc=1, argv=0xbfe3daf8) at compilers/imcc/main.c:801
#22 0xb7fb608c in imcc_run (interp=0x90c2040, sourcefile=0xbfe3f63d "x.pir", argc=1, argv=0xbfe3daf8) at compilers/imcc/main.c:1092
#23 0x08048978 in main (argc=1, argv=0xbfe3daf8) at src/main.c:60
(gdb)


See the problem now? What's happening in this little snippet is that we call IMCC to compile a string of code for us into a new Sub PMC. During compilation we have an :init sub, which gets executed immediately in a new runloop. That's why you see two instances of runops_slow_core in the backtrace: the second is the new runloop created inside IMCC to execute the :init sub. That executing sub throws an exception, so Parrot's exception system looks for a handler and finds one.

The problem comes because the handler that we find exists in a different execution context than the one where control flow currently is. The inner runloop jumps control flow to the handler, and continues executing like normal until the end of the program (or, in this last case, until the breakpoint). Once that runloop gets to the end of the program it exits, and control flow continues from the point inside IMCC where the :init sub was first executed. This jumps back up to the outer runloop, which attempts to continue its operations but now with a corrupted interpreter.
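The control flow is easier to see in a toy model. The C sketch below is a tiny made-up bytecode interpreter (DemoVM, the OP_* opcodes and demo_runloop are all invented for this example and have nothing to do with Parrot's real runcores) in which an "invoke" op starts a nested runloop, and a throw simply sets the program counter of whichever loop happens to be running to the handler address. The handler lives in the outer code, so the inner loop runs the handler and the rest of the program, returns, and then the outer loop carries on from just after the invoke as if nothing had happened.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

enum demo_op { OP_PUSH_EH, OP_INVOKE, OP_LOG, OP_THROW, OP_END };

typedef struct DemoInstr { enum demo_op op; int arg; } DemoInstr;

typedef struct DemoVM {
    const DemoInstr *code;
    int    handler_pc;     /* -1 when no handler is installed */
    int    depth;          /* current runloop nesting depth */
    int    max_depth;      /* deepest nesting seen */
    char   log[32];        /* one character per executed OP_LOG */
    size_t log_len;
} DemoVM;

/* Run ops starting at pc until OP_END; OP_INVOKE recurses into a new loop. */
void demo_runloop(DemoVM *vm, int pc)
{
    vm->depth++;
    if (vm->depth > vm->max_depth)
        vm->max_depth = vm->depth;
    for (;;) {
        const DemoInstr in = vm->code[pc];
        switch (in.op) {
        case OP_PUSH_EH:
            vm->handler_pc = in.arg;
            pc++;
            break;
        case OP_INVOKE:
            demo_runloop(vm, in.arg);   /* nested runloop, like the :init sub */
            pc++;                       /* outer loop resumes here afterwards */
            break;
        case OP_LOG:
            assert(vm->log_len + 1 < sizeof vm->log);
            vm->log[vm->log_len++] = (char)in.arg;
            vm->log[vm->log_len] = '\0';
            pc++;
            break;
        case OP_THROW:
            assert(vm->handler_pc >= 0);
            pc = vm->handler_pc;        /* the jump happens in the INNER loop */
            break;
        case OP_END:
            vm->depth--;
            return;
        }
    }
}

/* Run a fixed program whose nested sub throws to an outer handler. */
const char *demo_run_nested_throw(DemoVM *vm)
{
    static const DemoInstr program[] = {
        { OP_PUSH_EH, 3 },    /* 0: handler is at index 3 */
        { OP_INVOKE,  6 },    /* 1: run the nested sub at index 6 */
        { OP_LOG,   'a' },    /* 2: normal path after the invoke */
        { OP_LOG,   'h' },    /* 3: handler label */
        { OP_LOG,   'x' },    /* 4: rest of the program */
        { OP_END,     0 },    /* 5 */
        { OP_THROW,   0 },    /* 6: body of the nested sub */
        { OP_END,     0 }     /* 7 */
    };
    memset(vm, 0, sizeof *vm);
    vm->code = program;
    vm->handler_pc = -1;
    demo_runloop(vm, 0);
    return vm->log;
}
```

In this model the log comes out as "hxahx": the handler and the tail of the program run once inside the nested loop and then again in the outer loop. The toy version only duplicates work, whereas in Parrot the outer runloop resumes with state that the inner one has already torn through.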

This mimics several other problems we've seen in other tickets, where throwing an exception from inside a child runloop creates strange problems in the parent runloop. For a long time we didn't really understand why it happened. Now, thanks to pmichaud's small test case, I think we know exactly why it's happening and are even starting to formulate a few ideas for fixes. Look to the list for more discussion on this issue.

Friday, July 10, 2009

Python, Trac, and Agile

Haven't blogged in a few days, and it's not because I've been burned out from my blogging sprint two weeks ago like some people suggested. I've been under the weather and have had a lot of other things going on that required my attention. I actually have drafted several posts, just haven't published any of them for a variety of reasons. I have been getting good feedback still from several of my older posts, so that's nice to see.

At work we've been trying to become more Agile, and my boss has started using the curious words "agile" and "scrum" more frequently. We've had a very small team here for a while and our development processes have been very ad hoc and personalized. When your "team" is one person, and each "department" is one or two people, it's hard to say we need to implement particular methodologies to help promote better team development. Of course, as we grow the need for more standardized methodologies is becoming clear.

We set up a development server with SVN, Trac, and MediaWiki for our team to use. We had been using DynamSoft's Visual Source Anywhere Hosted service for our source control, which worked reasonably well when our team was smaller. However, as we grew larger and as our network infrastructure became more capable, it became a less and less attractive option than running our own SVN repository on one of our development servers. We were also using DynamSoft's Issue Tracking Anywhere, which I never liked personally and found to be far less helpful to us than a free alternative like Trac or Bugzilla would be.

So I installed SVN and created the repository (easy), installed XAMPP (easy) and MediaWiki (very easy). Then, it was time to install Trac. Saying that the installation (and ongoing configuration and maintenance) was difficult is an understatement. Welcome to dependency hell. Lasciate ogni speranza, voi ch'entrate (abandon all hope, ye who enter).

XAMPP installed Apache 2.2, and I picked up the "latest" SVN (which is what the Trac documentation specified), which happens to be 1.6.3. I'm running this whole monstrous software stack on Windows Server. And don't tell me to use Linux instead; that isn't an acceptable or available solution in this case. I first tried installing everything on the existing IIS server, but that was an even worse nightmare. So, I have made some concessions by running Apache instead (the IT guys aren't thrilled about having one special-case Apache server in a whole network of IIS servers, but they are going to deal with it).

Between the core Python distribution (2.5), mod_python (3.3.1), the Python SVN bindings, and Trac, there are very few setups that I can use to ensure compatibility. Combine this with the fact that I have the newest SVN (1.6.3 at the time) and the newest Apache (2.2), and very few configurations seem to work. Following the installation instructions for Trac alone gives me huge headaches because the versions specified there don't work together, and that's straight out of the Trac documentation.

I've been able to get Trac 0.11 installed and working but I have not been able to install any of the plugins we are interested in. Apparently most of the newest plugins expect Trac 0.11 to have Genshi 0.6, but my installation manages only to have Genshi 0.5.2. I tried to update, but Genshi doesn't appear to be compatible with Python 2.5 or something, and that failed. I've seen too many error messages over the course of a week of trying to remember what they all are specifically.

We already have Trac installed and in use by our team, so I'm not switching now to use Bugzilla or something different. However, as my first real experience with the Python ecosystem I am severely unimpressed and even disheartened. It is far more difficult getting these kinds of things installed than it should be. I don't deny that I've been a bit spoiled by Perl's CPAN and Ubuntu's apt-get repository system, things that just work when I want them to.

The documentation I have seen (when any even exists) and the "help" I've been able to get from people is sub-par at best. Maybe that's the norm in the Python world or the open source world in general, and that's very upsetting to think about. It is all the more reason why I believe so strongly in the Parrot concepts of interoperability and platform abstraction.

Thursday, July 2, 2009

Writing PMCs in NQP

So the question has arisen lately, what is L1 going to look like, and how hard is it going to be to write ops and PMCs in it? The answer is that we aren't going to be writing in L1 directly, we have PCT and will be writing it in a higher level language and compiling it down directly to Parrot bytecode. Here is an example of what the FixedPMCArray PMC type will look like rewritten in NQP:

class FixedPMCArray :need_ext :provides('array') {
    has $.size as int;
    has @.pmc_array as pmc;

    vtable elements() as int {
        return $this.size;
    }

    vtable destroy() {
        if $this.pmc_array != null
            Parrot::mem_sys_free($this.pmc_array);
    }

    vtable get_integer() as int {
        return $this.elements();
    }

    vtable get_bool() as bool {
        return $this.elements() != 0;
    }

    vtable get_integer_keyed_int($idx as int) as int {
        # prefix + coerces the stored PMC to an integer
        my $intval as int = +( $this.pmc_array[$idx] );
        return $intval;
    }

    vtable set_integer_keyed_int($idx as int, $val as pmc) {
        if $this.elements() < $idx
            $this.set_integer_native($idx);
        $this.pmc_array[$idx] = $val;
    }

    vtable get_string_keyed_int($idx as int) as str {
        # prefix ~ coerces the stored PMC to a string
        my $strval as str = ~( $this.pmc_array[$idx] );
        return $strval;
    }

    vtable mark() {
        loop (my $i = 0; $i < $this.size; $i++) {
            my $pmc = $this.pmc_array[$i];
            if !Parrot::PMC_IS_NULL($pmc)
                Parrot::Parrot_gc_mark_PObj_alive($INTERP, $pmc);
        }
    }

    vtable set_integer_native($size as int) {
        # only ever grow the storage
        if $this.size >= $size
            return;

        my $pmc_size = Parrot::sizeof_pmc_ptr();
        my @new_pmc_array = Parrot::mem_sys_allocate($size * $pmc_size);
        loop (my $i = 0; $i < $this.size; $i++) {
            @new_pmc_array[$i] = $this.pmc_array[$i];
        }
        $this.size = $size;
        $this.pmc_array = @new_pmc_array;
    }
}


There are a few points to note here. First, I know this isn't perfect Perl 6 and I'm sure I screwed up some syntax here and there. I apologize for that, but I'm not really interested in going back line-by-line to fix it. This is just a thought experiment after all, and the important point isn't getting the syntax correct but demonstrating that the approach is workable. Second, I'm treating NQP here as just a particular syntax over very low-level semantics. The declaration "has @.pmc_array as pmc" is going to be equivalent to the C-ish code "ATTR PMC** pmc_array". That is, we just assume that everywhere we see "as pmc", it will become the equivalent C type PMC*, and the @ sigil just adds another * to it.

One more thing worth noting is that I am assuming the management of the ATTR structure will be automated. The PMC compiler will recognize that this PMC type has attributes and will automatically allocate them on initialization and automatically deallocate them on destruction.
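To make that mapping a bit more concrete, here is a guess, in plain C, at roughly what the two "has" declarations and the automated attribute management might boil down to. DemoPMC, DemoFixedPMCArrayAttrs and the demo_attrs_* functions are names I've invented for the sketch, and the real PMC compiler output would certainly look different; unlike the NQP above, this version also frees the old storage when it grows.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for a PMC. */
typedef struct DemoPMC { int value; } DemoPMC;

/* "has $.size as int; has @.pmc_array as pmc;" as a C attribute struct:
 * "as pmc" becomes DemoPMC*, and the @ sigil adds one more *. */
typedef struct DemoFixedPMCArrayAttrs {
    long      size;
    DemoPMC **pmc_array;
} DemoFixedPMCArrayAttrs;

/* What "automatically allocate on initialization" might amount to. */
DemoFixedPMCArrayAttrs *demo_attrs_init(void)
{
    DemoFixedPMCArrayAttrs *attrs = malloc(sizeof *attrs);
    if (attrs != NULL) {
        attrs->size = 0;
        attrs->pmc_array = NULL;
    }
    return attrs;
}

/* Mirrors set_integer_native: grow only, copying the existing slots.
 * Returns 1 on success (including "nothing to do"), 0 on allocation failure. */
int demo_attrs_resize(DemoFixedPMCArrayAttrs *attrs, long new_size)
{
    assert(attrs != NULL && new_size >= 0);
    if (attrs->size >= new_size)
        return 1;
    DemoPMC **grown = malloc((size_t)new_size * sizeof *grown);
    if (grown == NULL)
        return 0;
    for (long i = 0; i < attrs->size; i++)
        grown[i] = attrs->pmc_array[i];
    for (long i = attrs->size; i < new_size; i++)
        grown[i] = NULL;              /* new slots start out empty */
    free(attrs->pmc_array);
    attrs->pmc_array = grown;
    attrs->size = new_size;
    return 1;
}

/* What "automatically deallocate on destruction" might amount to. */
void demo_attrs_destroy(DemoFixedPMCArrayAttrs *attrs)
{
    if (attrs != NULL) {
        free(attrs->pmc_array);
        free(attrs);
    }
}
```

The point is only that nothing in the NQP version needs semantics richer than a struct, a pointer-to-pointer, and a couple of calls into the allocator.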

Since NQP is going to be compiled down into a very low-level bytecode that should be roughly as capable as C code, it is going to have direct access to C functions in libparrot. It will not be calling functions through NCI; it will be constructing machine-level call frames and executing functions directly. I show this using the syntax Parrot::FUNCNAME, which helps to differentiate functions which must be called with C semantics (pushing arguments onto the system stack) from those functions which can be called with L1 semantics instead (and I'm not entirely sure what those will look like anyway, but they won't be stack-based, you can be sure of that). The $INTERP constant is a reference to the current interpreter. Instead of writing Parrot::, we could just as easily write C:: instead.

So that's a quick look at what a basic core PMC could look like in NQP. If we all remember that in this particular case the NQP is going to be compiled down to low-level code and not into higher-level PIR/PASM, this all starts to make a lot more sense. Think of it like writing C but with slightly different syntax (and different underlying semantics), and without any of the high-level features that you would expect from Perl 6.