Computers Archives

Sep 11

Upgrading MythTV to run on Fedora 7 with kernel 2.6.22

If you use MythTV, you are probably aware by now that Zap2It Labs stopped providing free listing data as of 2007-09-01. There is an alternative, though: Schedules Direct, a for-fee service that is a drop-in replacement for Zap2It Labs ... except that it costs $5.00/mo and you have to upgrade MythTV to 0.20.2.

I was running MythTV 0.20 on FC4, and it worked pretty well apart from the occasional crash. I was planning to just upgrade to the new version of MythTV and leave the OS at its current version. That was the plan, anyway, until I found out that atrpms.net no longer maintains packages for FC4. So I decided that if I was going to have to compile from source, I might as well update everything.

That's when the adventure began. It turns out that things have changed a lot since FC4, especially with the IVTV driver (the driver for the PVR-350 capture card I use). The driver has been integrated into the kernel as of 2.6.22. This is a good thing, as it means the drivers have become much more stable, but the documentation is still pretty lousy. Here's an outline of what I had to do to get my box back up and running... hopefully it will save you some time:

So what exactly didn't work when I upgraded?


  • The video capture card was not automagically detected as I expected it to be.

  • Lots of problems getting MythTV to use the PVR-350 framebuffer for display... this required some code changes.

  • LIRC does not start properly (this is not a new problem... I also ran into it on FC4).

How I fixed it all

I usually follow Jarod Wilson's "Fedora Mythtvology" guide when installing MythTV. The current version as of September 2007 is for Fedora 6. For the most part it works fine for Fedora 7 as well. There are just a few quirks you have to be aware of due to the 2.6.22 kernel and ivtv.

IVTV was integrated into the kernel as of 2.6.22 (as was ALSA, so you can skip the sound portion of the tutorial). When you get to the ivtv step, do the following:

yum -y install ivtv ivtv-kmdl-$KVER ivtv-firmware

Next add the following 3 lines to /etc/modprobe.conf

alias char-major-81 videodev
alias char-major-81-0 ivtv
install ivtv /sbin/modprobe --ignore-install ivtv; /sbin/modprobe ivtv-fb

That should do it for ivtv; after your next reboot the /dev/video0 device should be working.

Next, you will need to install the X driver. Since Fedora 7 uses X11 release 7, you will need a new X driver, available here.

Next install the xorg-server modules with:

yum -y install xorg-server

Now compile, like so:

cd <ivtvdev source directory>
./configure --prefix=/usr
make
make install

Next, edit the xorg.conf file in /etc/X11 as Jarod describes, but change the driver name from ivtvdev to ivtv.
Now when you reboot, X should load up on the framebuffer.

Now that you can capture video and load X on the framebuffer, let's fix MythTV so it can show video on the framebuffer (in my experience this looks much better than just using a video card). You have to patch the sources to include the controls for the new ivtv-fb commands, as documented in 3486.

One option is simply to build trunk, although depending on the day you never know exactly what you are going to get. The other is to backport the patch to 0.20.2. Fortunately I have already done that for you. You can download the SRPM here. This is basically a patched version of MythTV-0.20.2-165 from atrpms.net.

You should be up and running after one more reboot. Just set up MythTV to use the PVR-350 output device and you're set.

NOTE: If lircd does not start up properly on boot, you might need to modify the /etc/init.d/lircd script so that it tries to reload the lirc_i2c module. Add the following right after the "start() {" line:

# Runs when lsmod lists a lirc module: (re)load lirc_i2c before lircd starts
if [ "$(/sbin/lsmod | grep lirc)" != "" ]; then
    echo -n $"Reloading lirc_i2c: "
    /sbin/modprobe lirc_i2c
    RETVAL=$?
    echo
fi

Jun 19

My new T60p

UPDATE
I have received the replacement T60p from Buy.com; it has the LG screen, NMB keyboard, and Intel wireless card. The LG screen is noticeably better than the Samsung screen: the colors are more vibrant, the horizontal lines that showed up on the upper part of the Samsung screen are not there on the LG, and there is less light leakage. While the LG screen is far better than the Samsung, it is still somewhat sparkly/grainy (oh well, I guess this is just the state of current widescreen technology).

Also it seems that even though the Atheros card, which was in the original laptop, is supposed to be better than the Intel ABG card, which is in the new one, my new machine does not have the wireless problems that the other one had. Hurray!
--------
ORIGINAL POST

Well my new T60p has arrived. It came last week and overall I am very pleased with it. It is very fast and responsive (there's a lot of people out there bashing windows vista but I happen to love it and find it kind of hard to use windows XP now). The keyboard is also VERY nice and the machine works very well in general.

Here's my complaints though:

1) The screen is noticeably grainy. What I mean by this is that there is something of a sparkle to the display. This is common on LCDs, but it is driving me nuts. It feels worse than normal; it's like I'm looking at the laptop through a screen door. I don't know, maybe I am OCD, because I never noticed it on my screen at work and now that one is bothering me too. I have sent the laptop back because I got the Samsung screen, which is supposed to suffer from this more than the LG, so I am going to pay the extra $$$ and try to get the LG screen. I hope the sparkle on the LG screens is less noticeable, because I really like the ThinkPad T60s for the most part.

2) The wireless works well most of the time but occasionally it will drop out for no apparent reason and you will have to reboot the machine to get it back.

3) Windows Vista overall is very good, but it's really annoying with all the things that it wants you to confirm... blegh, I just want it to do what I tell it to. I realize that they are trying to make it easier to notice if a virus or spyware is operating on your machine, but it's really annoying and I think there might be better ways to accomplish the same thing (although I couldn't tell you any if you were to ask me).

WARNING, WHAT FOLLOWS IS A DRM SOAP BOX --

It's also somewhat insulting that they have added all these tools to help you prevent viruses and spyware while adding a fair amount of DRM, which in my opinion is just as bad as spyware. DRM stands for Digital Rights Management, and Microsoft insists that it is good for the end user because it allows them to make more content available on the PC, but I don't really buy this. The content that they are so concerned about being stolen will be made available without the DRM if there is a big enough market. The companies are profit driven, and therefore if consumers demand that the content be available on the PC then it will be; otherwise it is not valuable. Adding DRM is just a cop-out because the providers want it. The end user was nowhere in the equation when Microsoft added this.

Just as an example of how DRM hurts the consumer, I purchased a small number of songs from Walmart's online download service. It was 88 cents a song plus tax (so about 94 cents total). The songs sounded fine quality wise and you had permission to burn them to CD up to 10 times. They also have a policy of not reissuing licenses. Well this is what bit me. I have the song files backed up but the licenses were not so now on my brand new computer I cannot play the music that I legally paid for only a few months ago. Walmart has a record of these purchases, I can see them on their website, however they will not reissue the licenses. DRM hurts consumers. In fact it punishes people for obeying the law (which is presumably what the record companies want). If I had illegally downloaded those songs I would have had no trouble transferring them to my new computer, but because I chose to legally download them and pay for them I am now stuck with unplayable songs and must purchase them again, a second time.

END RANT

May 25

New Computer

I know it has been quite some time since my last post. If there's anyone out there still regularly checking this blog... sorry.

Anyway, I have been in the market for a new computer. My old ThinkPad finally bit the dust last November (at least the screen did), so I have been limping along with a crappy Toshiba for the last 6 months. I was waiting for the laptops featuring Santa Rosa to come out, but around the middle of April I decided that Santa Rosa wasn't worth waiting for and started seriously shopping for a new laptop.

I began the search thinking I would not purchase another ThinkPad, as I was somewhat leery of them since the line was sold to Lenovo. The other major computer I was looking at was a MacBook Pro. Well, long story short, I ended up back in the ThinkPad's arms and settled on a ThinkPad T60.

I ordered it on sale at lenovo.com and expected it to take several weeks. It took just about 1/2 a month to get here, arriving on May 2nd. I opened it up and immediately started hacking away at it... it was great! I was very happy with my purchase. Up until this point I had only been using it on AC power. Well, this is where my problems began. The machine would turn itself off every 2 or 3 minutes when it wasn't on AC (yes, the battery had been charged for a full 24 hours and was showing 100%). I thought upgrading the BIOS might fix this; it didn't. The problem progressed to the screen not turning on at all. I called Lenovo and asked to return the machine. They were friendly, issued an RMA for the old machine, and helped me get a new order in (with slightly upgraded specs) for a replacement. They also assured me that I was on their critical list, seeing as I had already been waiting 2 weeks+ for my laptop.

That was back at the beginning of May. It is now almost the end of May and still no laptop. Oh well, it will get here when it gets here; I'm getting impatient though. On the upside, Lenovo has issued me a 10% discount in addition to my already discounted machine for having to wait so long. All in all I am getting this thing for about 50% less than what it usually costs!

Read the extended entry to see the specs of the laptop. I'll post an update on how I like the computer after I receive it.

Continue reading "New Computer" »

Feb 08

Learning to debug problems

I am a computer programmer and as a result I have to be pretty good at tracking down problems in things and successfully debugging them. When debugging it's imperative that a concrete methodology is applied. Here are some tips for debugging (these can apply to more than just computer programs!)

1) Assume the problem lies in something you have changed.
I find a good first assumption is that you created the problem if you are debugging something you have written. It's not likely that the operating system or the tools that have been used by many other people have the most obvious flaws. It's much more common for errors to be in your newly developed code. It's also much easier to find flaws in code you wrote than in the millions of lines of code that comprise the tools programmers use every day.

2) Start simple and work your way up from there.
It's an uncertain world and there could be any number of things wrong when you are debugging a problem but everyone has to start somewhere and it's the simplest things that give us the most progress.

3) Debugging is always a process of elimination
My third rule of thumb is that debugging is never readily obvious. You always have to start with an assumption, make a hypothesis and prove it right or wrong. Then repeat until you actually find the problem.

These are simple rules but they will take you far down the path of debugging and it's amazing how many people I know that don't apply them!
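One concrete way to combine rules 1 and 3 is to bisect your own recent changes: guess that the breakage is in the older half, test the guess, and throw away whichever half is cleared. The sketch below is only an illustration. ChangeBisector and its isBroken check are hypothetical names, and it assumes the changes are ordered oldest to newest, that the newest one is known to be broken, and that once a change breaks things everything after it stays broken.

```java
import java.util.List;
import java.util.function.Predicate;

/** Hypothetical helper: finds the first breaking change by repeated halving. */
class ChangeBisector {
    /**
     * Returns the index of the first change for which isBroken is true.
     * Assumes the last change is broken and that breakage persists once introduced.
     */
    static <T> int firstBadChange(List<T> changes, Predicate<T> isBroken) {
        int lo = 0;                    // earliest change that could still be the culprit
        int hi = changes.size() - 1;   // known to be broken
        while (lo < hi) {
            int mid = lo + (hi - lo) / 2;
            if (isBroken.test(changes.get(mid))) {
                hi = mid;              // hypothesis confirmed: culprit is mid or earlier
            } else {
                lo = mid + 1;          // hypothesis rejected: culprit is later
            }
        }
        return lo;
    }

    public static void main(String[] args) {
        List<Integer> revisions = List.of(101, 102, 103, 104, 105, 106);
        // Stand-in for "check out this revision and run the failing test".
        int culprit = firstBadChange(revisions, rev -> rev >= 104);
        System.out.println("First broken revision: " + revisions.get(culprit));
    }
}
```

Each pass through the loop is one hypothesis proved right or wrong, so six changes need at most three checks.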

Feb 06

Visually navigating information with Grokker

My roommate told me about a new search service called Grokker that allows you to visually browse information. I didn't quite understand what he meant when he was telling me about it, but I checked it out and it's actually really cool. It's basically a search engine that automatically categorizes information in groups and allows you to visually navigate the "map." I could see something like this being really useful for research. It's always interesting to see new ways that people are visualizing data. I believe that over the next 10 years visualization will be one of the biggest challenges that we will face in the computing and scientific communities.

Try it out at: www.grokker.com

Make sure you click on the map tab when the search results come up.

Apr 26

University of San Diego

Today I took a quick trip out to San Diego to meet some guys we work with at the university.  The first part of the day was mostly finances, etc ... not necessarily my cup of tea.  But later in the day we got to take a look at some of the things they are doing and it was really cool.  First they showed us a display wall that could show images in stereo and could be controlled by movement when you wore some special glasses.  The application they showed us was for exploring proteins; you could walk around and view the protein from different angles, which was pretty neat.  The thing that I enjoyed most, though, was a demo by a fellow named John.  He showed us some GIS stuff, which I really enjoy, and he had some pretty neat work-ups of real-time fly-bys over the earth ... it was like Google Earth on steroids.

Feb 21

Django and Christian Resources

For a long time I have wanted to create a website that provided functionality similar to the Logos software, a package that is basically an incredibly extensive Christian library.  A friend of mine gave me the idea of calling it Red Letters, a reference to the red-lettered text in the Bible that marks the words of Christ.  I've had the idea for a long time and haven't really done much with it beyond a few mockup sites.  Over the weekend I got interested again and looked over the mockups, decided they all sucked and started a new one.  I don't know if it's going to happen, but it sounds like a good chance to check out the Django web framework, which seems really nice.  I'm not sure how well suited it is to the problem, though, as most of the site will probably store the content as ThML files (an XML format for marking up religious text) and there is no built-in support for XML, although since it's just Python I should be able to add it easily.  We'll see; hopefully something comes out of all this.

Feb 14

Distributed X Server and Gigapixel images

Things have been very interesting at work lately.  I've gotten S.A.G.E (Scalable Adaptive Graphics Environment) to a place where it is working reasonably well.  It's pretty cool, but I am still interested in something that gives me a distributed X server.  I found a program today that does just that: Xdmx.  I haven't had an opportunity to try it out yet but hopefully I'll get a chance tomorrow.  If it works it could make my life a lot simpler because ANY application would be able to run on the display without modification.  To use the display with Xdmx, all one would have to do is set the DISPLAY environment variable to the Xdmx display.  This is very simple, and since it's the standard, people already know how to interact with it.  There is still a bandwidth problem, but I don't know how you would get around that unless your application is aware of the multi-display ability of the wall and is programmed to take advantage of it.  There appear to be extensions to support this in Xdmx, and it also looks like there are a few other libraries out there that make using the Xdmx extensions easier.  I'm very excited to try it out, to say the least.
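To make the "just set DISPLAY" point concrete, here is a minimal sketch of launching an unmodified X program against a different display from Java.  The class name LaunchOnWall, the display ":1" and the program "xclock" are placeholders chosen for illustration; the real Xdmx display number depends on how the server was started.

```java
import java.io.IOException;
import java.util.List;

/** Sketch: start an unmodified X application against another X display. */
class LaunchOnWall {
    /** Builds (but does not start) a process whose DISPLAY points at the given X display. */
    static ProcessBuilder onDisplay(String display, List<String> command) {
        ProcessBuilder pb = new ProcessBuilder(command);
        pb.environment().put("DISPLAY", display);  // the only change the application sees
        pb.inheritIO();                            // pass the child's output through to our console
        return pb;
    }

    public static void main(String[] args) throws IOException, InterruptedException {
        // ":1" and "xclock" are assumptions; substitute the real Xdmx display and program.
        Process p = onDisplay(":1", List.of("xclock")).start();
        System.exit(p.waitFor());
    }
}
```

Nothing in the launched program has to know about the wall, which is the whole appeal of the approach.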

In other news, today a friend showed me Max Lyons's website; he has some amazing hi-res (some gigapixel) images that he created by stitching together many shots from his Canon 60D.  Check out his website, I think you'll be impressed ... I was.  Website: http://www.tawbaware.com/maxlyons/ 

Feb 05

Auto-complete JComboBox

A common feature users have come to expect from JComboBoxes is autocompletion.  I don't know who started this, but I think it was popularized by the address bar in web browsers.  Swing does not provide auto-complete functionality by default, but a quick search on Google reveals that many people have come up with solutions.  I found this tutorial: http://www.orbital-computer.de/JComboBox/ which I like quite a bit, and the code is in the public domain so anyone is free to use it however they want.
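The core of any such solution is finding the first item that starts with what the user has typed.  The sketch below shows only that lookup; it is not the code from the tutorial above, and ComboPrefixMatcher is a made-up name.  Wiring it to the combo box editor so the completion appears as you type is the part the tutorial covers.

```java
import javax.swing.ComboBoxModel;
import javax.swing.DefaultComboBoxModel;

/** Hypothetical helper: case-insensitive prefix lookup over a combo box model. */
class ComboPrefixMatcher {
    /** Returns the index of the first item starting with typed (ignoring case), or -1. */
    static int firstMatch(ComboBoxModel<?> model, String typed) {
        for (int i = 0; i < model.getSize(); i++) {
            Object item = model.getElementAt(i);
            if (item != null
                    && item.toString().regionMatches(true, 0, typed, 0, typed.length())) {
                return i;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        ComboBoxModel<String> model =
                new DefaultComboBoxModel<>(new String[] {"Ester", "Jordi", "Jordina", "Sergi"});
        int match = firstMatch(model, "jor");
        System.out.println(match < 0 ? "no match" : model.getElementAt(match));
    }
}
```

An empty string matches the first item, so callers will probably want to skip the lookup until something has been typed.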

Feb 03

Principles of good software design 1

There are several open source projects that I keep tabs on because I enjoy the services they provide.  Recently one of the projects had several issues registered concerning timeouts with MySQL 4.0 and 5.0.  The response from the developers was that this is not a bug in their software: just increase the timeouts in the MySQL server.  This is a bad solution for a number of reasons, but mainly it is a bad software design decision.  Software should be designed to be as robust, stable, and usable as possible.  It is not hard to add a simple check that a connection is still open and to reopen it if it has timed out.  When designing software we should follow a couple of simple principles wherever possible:

1) If something can be solved in software, don't make the user deal with it.  Involve the user only when it is absolutely necessary (i.e., when you need some form of input or something of that sort).

2) If your software relies on a 3rd party package, don't expect the 3rd party package to work correctly; compensate for its mistakes in your own software as appropriate.

The developers who wanted the user to adjust timeout values broke both of these rules.
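As a rough illustration of the "check and reopen" idea, here is a small Java sketch.  ReconnectingConnection and Opener are hypothetical names, not part of the project in question, and the sketch assumes a JDBC 4 driver, since it relies on Connection.isValid.

```java
import java.sql.Connection;
import java.sql.SQLException;

/** Hypothetical wrapper that hides idle-timeout disconnects from the caller. */
class ReconnectingConnection {
    /** Anything that can open a fresh connection, e.g. a call to DriverManager.getConnection. */
    interface Opener {
        Connection open() throws SQLException;
    }

    private final Opener opener;
    private final int validationTimeoutSeconds;
    private Connection current;

    ReconnectingConnection(Opener opener, int validationTimeoutSeconds) {
        this.opener = opener;
        this.validationTimeoutSeconds = validationTimeoutSeconds;
    }

    /** Returns a live connection, reopening it if the old one is no longer valid. */
    synchronized Connection get() throws SQLException {
        if (current == null || !current.isValid(validationTimeoutSeconds)) {
            closeQuietly(current);
            current = opener.open();
        }
        return current;
    }

    private static void closeQuietly(Connection c) {
        if (c == null) {
            return;
        }
        try {
            c.close();
        } catch (SQLException ignored) {
            // The connection is already unusable; there is nothing more to do with it.
        }
    }
}
```

Callers would ask for get() before each unit of work, for example with an opener like () -> DriverManager.getConnection(url, user, password), where url, user and password are whatever the application already uses.  The user never has to touch a server timeout.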

Jan 26

JTables and other java anomalies

Recently I have been tinkering around with Java a lot in my spare time.  I've been working on a side project that is technically fairly simple but won't be very useful without some cool UI effects.  As a result I've been toying around with Swing a lot, implementing things like automatic live filtering of table data (like iTunes), etc.  One thing that has recently come to my attention and is really annoying me is the JTable default renderers for the header.  The renderers are controlled by the default look and feel and can only, as far as I can tell, display strings.  This means that I can't set an icon to indicate sorting direction or the like unless I provide my own renderer.  It's not hard to write your own renderer; however, when you do you lose all the goodness that comes from the platform L&F renderer unless you decide to reimplement it yourself on a case-by-case basis.  You should be able to set an icon for each column header; it's a standard, simple feature that should just be there.  I usually love Swing but for this one I've gotta deduct points.
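One possible workaround, sketched here under assumptions rather than offered as a proper fix, is to wrap the look and feel's own header renderer instead of replacing it.  IconHeaderRenderer is a made-up name, and the approach only works when the L&F renderer happens to return a JLabel, which Swing does not guarantee.

```java
import java.awt.Color;
import java.awt.Component;
import java.awt.Graphics;
import javax.swing.Icon;
import javax.swing.JLabel;
import javax.swing.JTable;
import javax.swing.SwingConstants;
import javax.swing.table.TableCellRenderer;

/** Sketch: decorate the look-and-feel header renderer rather than replace it. */
class IconHeaderRenderer implements TableCellRenderer {
    private final TableCellRenderer delegate;  // the renderer the L&F installed
    private final int iconViewColumn;          // view column that should show the icon
    private final Icon icon;

    IconHeaderRenderer(TableCellRenderer delegate, int iconViewColumn, Icon icon) {
        this.delegate = delegate;
        this.iconViewColumn = iconViewColumn;
        this.icon = icon;
    }

    @Override
    public Component getTableCellRendererComponent(JTable table, Object value,
            boolean isSelected, boolean hasFocus, int row, int column) {
        Component c = delegate.getTableCellRendererComponent(
                table, value, isSelected, hasFocus, row, column);
        if (c instanceof JLabel) {  // assumption: not every L&F has to return a JLabel
            JLabel label = (JLabel) c;
            // The component is shared between columns, so clear the icon elsewhere.
            label.setIcon(column == iconViewColumn ? icon : null);
            label.setHorizontalTextPosition(SwingConstants.LEADING);
        }
        return c;
    }

    /** Tiny placeholder arrow so the example needs no image files. */
    static Icon upArrow() {
        return new Icon() {
            public int getIconWidth() { return 8; }
            public int getIconHeight() { return 8; }
            public void paintIcon(Component c, Graphics g, int x, int y) {
                g.setColor(Color.DARK_GRAY);
                g.fillPolygon(new int[] {x, x + 4, x + 8}, new int[] {y + 8, y, y + 8}, 3);
            }
        };
    }

    /** Wraps whatever header renderer the table currently uses. */
    static void install(JTable table, int iconViewColumn, Icon icon) {
        TableCellRenderer base = table.getTableHeader().getDefaultRenderer();
        table.getTableHeader().setDefaultRenderer(
                new IconHeaderRenderer(base, iconViewColumn, icon));
    }
}
```

Calling IconHeaderRenderer.install(table, 0, IconHeaderRenderer.upArrow()) keeps the platform borders, fonts and colors and only adds the arrow to the first column.  It is still more work than a simple setter on the column would be, which is the complaint above.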

Another thing that is somewhat annoying is that there is no date picker component provided by Swing.  This is a fairly simple component that you can write yourself easily enough, but Swing is so feature rich that we don't normally expect to have to write such simple components.

OK, enough ranting.  On the whole I really like Swing, and if I were using another component framework I would probably have just as many complaints, if not more.  And yet I still feel the need to voice my complaints about Swing!

Jan 21

Optiputer and Terabit LAN Workshop

I've spent the last 4 days in San Diego attending a conference about Optiputer and Terabit networking.  The weather has been nice here and some parts of the conference have been very interesting.  The coolest part was probably the demonstration of a new 4K digital display system they have here at UCSD.  It's basically 4 hi-def regions creating an 8 megapixel display, and it is driven by a very expensive SGI machine.  They showed us several digitally remastered films and some scientific visualizations on the system and they were stunning.  It was amazing.  The other pretty cool thing they showed us was a demonstration of a remote control app that removes latency.  Basically they have a remote control car at UCI and a camera filming it.  The video stream is received here at UCSD and a remote is plugged in here, so the car is remotely controlled via the internet.  It was pretty cool because they did a good job demonstrating the issues that remote control such as this entails.  For a small remote controlled car latency isn't that big of a deal, but if you want to control some high-speed object from a distance, such as a jet or a car, the latency becomes a very big issue.
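A quick back-of-the-envelope calculation shows why speed matters so much here: the vehicle keeps moving for the whole round trip before any correction can take effect.  The sketch below uses made-up numbers (a 100 ms round trip and three illustrative speeds), not figures from the demo, and LatencyBudget is just a name chosen for the example.

```java
/** Sketch: how far a vehicle travels "blind" while a command is in flight. */
class LatencyBudget {
    /** Distance in metres covered during one network round trip. */
    static double blindDistanceMetres(double speedMetresPerSecond, double roundTripMillis) {
        return speedMetresPerSecond * (roundTripMillis / 1000.0);
    }

    public static void main(String[] args) {
        double rtt = 100.0;  // assumed round-trip delay in milliseconds
        System.out.printf("toy car at 2 m/s: %.1f m%n", blindDistanceMetres(2.0, rtt));
        System.out.printf("car at 30 m/s:    %.1f m%n", blindDistanceMetres(30.0, rtt));
        System.out.printf("jet at 250 m/s:   %.1f m%n", blindDistanceMetres(250.0, rtt));
    }
}
```

With those assumed numbers the toy car moves about 20 cm before a correction arrives, a car on a highway about 3 m, and a jet about 25 m.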

At the terabit LAN discussion we are talking about things that are about 5 years out, but the discussion of the capabilities we will have over the next 5 to 10 years is really exciting.

One last interesting point that was brought up here is that as 10Gb Ethernet chips become pervasive and cheap over the next year or two, there is a strong possibility that Ethernet will replace other connection technologies such as USB, FireWire, HDMI, etc.  This would simplify computer design significantly and also drive costs down, which would be a good thing.

I'm flying home tonight and looking forward to seeing everyone again so I'll close now. 

Jan 11

Adding Rollovers to JList components

Over the break I spent some time hacking around in Swing.  One of the things I have been wanting to do for a while now is add rollover functionality to the JList component.  This is a neat effect, but unfortunately the default implementation of JList does not support it.  However, without much work we can make it happen.

The idea is fairly simple: we extend JList with a couple of mouse event listeners and a custom cell renderer, salt to taste, and voilà, a JList component with rollover effects.

Step 1:

Extend the JList component and add appropriate mouse listeners.

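import java.awt.Color;
import java.awt.Component;
import java.awt.Container;
import java.awt.event.MouseAdapter;
import java.awt.event.MouseEvent;
import java.awt.event.MouseMotionAdapter;
import javax.swing.DefaultListCellRenderer;
import javax.swing.JList;
import javax.swing.UIDefaults;
import javax.swing.UIManager;

public class JListRollover extends JList {
    // Index of the cell currently under the mouse, or -1 once the mouse has left the list
    protected int mouseOver;

    // Look-and-feel colors used to restore a cell when it is not highlighted
    static Color listBackground, listSelectionBackground;
    static {
        UIDefaults uid = UIManager.getLookAndFeel().getDefaults();
        listBackground = uid.getColor("List.background");
        listSelectionBackground = uid.getColor("List.selectionBackground");
    }

    public JListRollover(Object[] listData) {
        super(listData);
        mouseOver = -1;

        setCellRenderer(new JListRolloverCellRenderer());

        // Track which cell the mouse is over and repaint so the renderer can highlight it
        addMouseMotionListener(new MouseMotionAdapter() {
            public void mouseMoved(MouseEvent e) {
                mouseOver = locationToIndex(e.getPoint());
                repaint();
            }
        });

        // Clear the highlight when the mouse leaves the list
        addMouseListener(new MouseAdapter() {
            public void mouseExited(MouseEvent e) {
                mouseOver = -1;
                repaint();
            }
        });
    }

    // The class body continues with the cell renderer from Step 2.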

There are two important points to notice here.  First, the mouseOver variable stores the index of the item the mouse is currently over; when the mouse leaves the list it is reset to -1.  Second, that index is obtained from a MouseMotionListener and a JList method, locationToIndex, which translates an X, Y list coordinate to an index number.

Step 2:

Implement a custom cell renderer to create the visual that happens when the rollover is triggered.  For this demo, let's just highlight the item yellow.

    // Inner class of JListRollover, so it can read mouseOver directly
    class JListRolloverCellRenderer extends DefaultListCellRenderer {
        public JListRolloverCellRenderer() {
            super();
            setOpaque(true);
        }

        public Component getListCellRendererComponent(JList list, Object value,
                int index, boolean isSelected, boolean cellHasFocus) {
            Component addHighlight = super.getListCellRendererComponent(list,
                    value, index, isSelected, cellHasFocus);
            Color bgColor;

            if (index == mouseOver && !isSelected)
                bgColor = Color.YELLOW;            // rollover highlight
            else if (isSelected)
                bgColor = listSelectionBackground;
            else
                bgColor = listBackground;          // undoes any earlier highlight

            addHighlight.setBackground(bgColor);

            // Necessary if the renderer is a container: color its children too
            if (addHighlight instanceof Container) {
                Component[] children = ((Container) addHighlight).getComponents();
                for (int ii = 0; ii < children.length; ii++) {
                    children[ii].setBackground(bgColor);
                }
            }

            return addHighlight;
        }
    }
} // end of JListRollover

 In the custom cell renderer you can create any visual effect you want on the rollover.  If index equals mouseOver inside getListCellRendererComponent, just add the code that creates the visual effect.  Make sure you also have code that undoes the effect when the conditional fails, because the same renderer component is reused for every cell.  Also note that JListRolloverCellRenderer is an inner class of JListRollover, which is how it can read mouseOver.

Pretty simple but adds a lot if used well in some situations.

Try out the demo with Java WebStart here. 

Download the source code for the demo here. 

Nov 17

Supercomputing 2005 Day Five

I can't believe I only have one day left of Supercomputing.  It's amazing how quickly time flies when you are having fun.  Today was fun.  We heard about the advances that are being made in the biosciences from several professors.  After lunch we heard from two guys from Disney explaining how they used computers to advance the state of the art in the animation industry.  It was interesting to hear about the history of how Disney has used computers in their movies and what the future holds for them.  I also got a demo of the University of Illinois at Chicago's SAGE software, which was pretty cool.  SAGE is an application that ties a whole bunch of monitors together and causes them to act as a single desktop.  They also had an application that allowed you to view stereoscopic (3D) information without the 3D glasses that you normally need.  They had a pretty neat videoconferencing application running for the demo.

I know that I didn't write all that much today but it's been a long week and I'm kind of tired so this is all you get for now.

Nov 16

Supercomputing 2005 Day Four

Today was filled with a lot of discussion about really cool technologies that have become something of buzzwords, like nanotechnology and quantum computing.  I learned a lot today and this was probably my 2nd favorite day of the event, next to the tutorial on parallel file systems.

I started the day out with a seminar on nanotech.  There was a technical discussion of a non-volatile memory that is likely to be available in the next couple of years.  It's a promising technology that will likely be about as fast as DRAM but have the potential to have much higher bit densities.  The speaker explained how the memory was based on nanowires and the difficulty is coming up with unique addressing for the wires.

Later in the afternoon Stephan and I attended a session on how companies producing commodity items like toilet paper and cleaning supplies use HPC (high performance computing) to make the manufacturing and distribution of goods more efficient.  The speaker was from Procter & Gamble; he had many examples of how P&G had used HPC to increase profit.  One of the more interesting examples he mentioned was that they had problems with Pringles flying off the conveyor belts because they were trying to produce them too quickly, and they also had the problem of seasoning not getting evenly distributed.  It's obvious that this would cause serious problems in the production of Pringles.  P&G used HPC to model the aerodynamics of the chips, devised a slightly better shape, and also modified their conveyor system.  Who woulda guessed, HPC is even helping the manufacture of chips!

After the P&G example we heard from some researchers using HPC to improve our ability to predict tsunamis.  Their talk was interesting, but the part I found most interesting was the visualization software they used, GeoForce.

Finally the day ended with two presentations on quantum computing.  The first was by a company in New York that does cryptography.  They presented the basics of quantum cryptography, which was fascinating, and then demonstrated an example system they had put together.  One of the notable things about quantum cryptography is that it is distance limited: the presenter thought the foreseeable maximum distance would be around 250km, while current technology only allows for about 100km.  The second presenter worked for a company that is at the cutting edge of quantum processor design.  He showed us a 16-qubit processor, which was really neat to see; it's the cutting edge and only about half the size of your thumbnail.  According to the presenter this was 6-month-old technology.  He thinks that as we discover the keys of quantum computing, many groundbreaking discoveries will be made in nearly every one of the sciences, particularly physics.  After today, I am very excited about where we are heading with technology.

Nov 16

Supercomputing 2005 Day Three

Day three was the first general session at SC05.  The day began with a keynote speech by Bill Gates about the future of HPC.  It was amazing how many people showed up to listen to the talk.  They filled up a cavernous conference room and two large overflow rooms.  After the talk I wandered around for a bit and had the opportunity to talk with a gentleman doing large-scale InfiniBand installations.  It was very interesting talking to him and seeing some of the applications of the technology that he had.  There was a really cool video conferencing app in hi-def.  We also got to take a look around at some of the hardware that is installed here to support the massive network they have available.

After the InfiniBand talk Stephan and I wandered around a bit and talked to some folks at NASA.  One woman told us about her work on a project to enable faster sequencing of DNA.  What she was doing was very interesting, and she also showed us some of her coworker's work with carbon nanotubes.  I also had an opportunity to talk with another gentleman who had a simulation of airflow through a component in one of the engines that was having issues.  It was a cool simulation, but the part that really struck me was the visualization tools that they were using.  The guy I was talking to had done all the visualization, so it was a great opportunity to get some suggestions on general methods we could use to visualize datasets.  He also had a wall of monitors and mentioned that they had written all the software to drive it themselves, but that Stanford had a package available called Chromium to do OpenGL rendering across multiple screens.

I forgot to mention the other day that we stopped by the Scali booth and got a demo of the upcoming 5.0 release.  It's obvious that the system is not ready for prime time, as they ran into several null pointer exceptions during the demo.  They completely revamped the GUI and it's a lot nicer than the old GUI.  It's easy to create views of different charts over time in the new system, so if you have a general setup that you like to bring up regularly you can just save it to the desktop and with one click you have it back, which is nice.  Unfortunately there were not any obvious changes to the system outside of the GUI.  They did mention that there were plans to make all information that was available via the GUI also available via command line tools. 

Nov 14

Supercomputing 2005 Day Two

Day two of supercomputing began with another tutorial.  Today's was about the future of high performance computers.  It was an interesting tutorial but they had too much material to cover and for the sake of time a lot of slides were cut out which made it difficult to follow at times.

Some of the major points that I took away from the tutorial include:

  • The curve describing cost per gigaflop has flattened since 2001
  • Performance has continued to increase, but at a much slower rate since the dot-com bubble
  • As processors increase in speed the memory wall will become more of a problem; processors may overcome it in the future by placing several hundred megabytes of memory directly on the processor die.
  • Non-volatile nano memory is likely to be available in the near future

The grand opening of the general conference was also today.  We saw several very interesting booths, but possibly the most interesting people we talked to were from a company providing a product called Mitrion.  Mitrion provides a C-like language for programming FPGAs.  The company claims, at least, that regular application programmers can easily take advantage of the benefits FPGAs have to offer without special knowledge of hardware.  The programmer interacts with the FPGA via Mitrion, and the Mitrion code then gets translated to VHDL code.  It is not tied to any particular FPGA vendor, so the system is relatively portable.  This is a very interesting development if what they claim is true.  I know that there are several people in our company who will be very interested in this.

I'm looking forward to tomorrow when we are supposed to get a tour of the network and there are a number of special sessions that I am excited for. 

Nov 13

Supercomputing 2005 Day One

Day one of Supercomputing down, five more to go.  I had a lot of fun today and learned a lot too.  I missed going to church today but beyond that had a great time.  Seattle is a neat town and, thanks to Ervin, we get to stay in a really nice hotel which is walking distance from the Convention Center.  It's amazing how much nicer it is to be a few blocks from the places you have to be versus having to drive in a foreign city.

Today was a tutorial session.  Both Stephan and I attended the "Parallel I/O in Practice" session.  It was taught by Rob Latham, Rob Ross, and Rajeev Thakur from Argonne National Labs and Bill Loewe from Lawrence Livermore National Labs.  The first half of the day was spent mostly talking about parallel file systems.  Three systems were discussed: IBM's GPFS (General Parallel File System), Argonne's PVFS2 (Parallel Virtual File System), and Lustre, which is the effort of three DOE labs, Intel, HP, and Cluster File Systems, Inc.

The major thing that I took away from the morning session is how much effect a dedicated MPI-IO library can have on the performance of a parallel file system.  Both GPFS and PVFS2 have dedicated MPI-IO implementations; Lustre does not.  GPFS's MPI-IO implementation is only available for AIX, and PVFS2's is available on Linux.  Both GPFS (on an AIX system) and PVFS2 (on a Linux system) exhibited good performance and excellent scalability, whereas Lustre, due to the lack of an optimized MPI-IO library, lagged the other two significantly.  There was a quick table of do's and don'ts for parallel file systems which I thought might be useful:

  1. PFSes are not optimized for metadata; they are optimized for moving data around.
    1. Running ls and du on large directories significantly impacts performance on these filesystems.
    2. Smaller directories work better.
  2. Keep file create, close, and open calls to a minimum, as they are slow operations.
    1. Use one open and one close.
    2. Use shared files, or at least limit file access to a subset of tasks.
  3. Aggregate writes - PFSes are not databases; they do better with large writes (64k blocks or larger is good).
    1. Using collective I/O can make this possible.
    2. Contiguous file formats are better than non-contiguous ones.
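
The "aggregate writes" advice can be illustrated with a minimal sketch in Python.  This is not code from the tutorial: the AggregatingWriter class, the 100-byte record size, and the in-memory sink standing in for a file on a parallel file system are all assumptions made for illustration.  The idea is simply to buffer many small records and hand them to the file in 64k blocks.

```python
import io

BLOCK_SIZE = 64 * 1024  # 64k, the lower bound suggested in the list above


class AggregatingWriter:
    """Hypothetical helper: collects small records and writes them in large blocks."""

    def __init__(self, raw, block_size=BLOCK_SIZE):
        self.raw = raw                # any object with a write(bytes) method
        self.block_size = block_size
        self.pending = bytearray()    # records waiting to be written
        self.write_calls = 0          # number of writes actually issued to the file

    def write(self, record):
        self.pending += record
        # Only touch the file once a full block has accumulated.
        while len(self.pending) >= self.block_size:
            self._emit(self.block_size)

    def _emit(self, nbytes):
        self.raw.write(bytes(self.pending[:nbytes]))
        del self.pending[:nbytes]
        self.write_calls += 1

    def flush(self):
        # Write whatever is left, even if it is smaller than a block.
        if self.pending:
            self._emit(len(self.pending))


def demo():
    sink = io.BytesIO()  # stand-in for a file on a parallel file system
    writer = AggregatingWriter(sink)
    for _ in range(10_000):
        writer.write(b"x" * 100)  # 10,000 small 100-byte records
    writer.flush()
    return len(sink.getvalue()), writer.write_calls
```

Calling demo() returns the total number of bytes written and the number of writes that reached the sink, so the 10,000 small writes can be compared with the handful of large ones actually issued.  In a real MPI program collective I/O would do this aggregation across tasks; the sketch only shows the single-process version of the idea.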

In the afternoon session we discussed MPI-IO and the basics of its architecture.  ROMIO is the primary implementation available and was written by Argonne.  It is distributed with MPICH2, and modern implementations do not require anything extra beyond including mpi.h.  The three core functions MPI-IO provides are MPI_File_open, MPI_File_write, and MPI_File_read.  MPI-IO takes care of the things that POSIX doesn't provide, such as collective access.

Finally, we discussed two high level APIs that sit atop a PFS and MPI-IO: PnetCDF and HDF5.  These are two portable file formats for storing data.  PnetCDF is based upon the netCDF file format and is completely compatible with it.  HDF5 was the first parallel file format, and the major difference between the two is that HDF5 presents a hierarchical system to the programmer, whereas data storage in PnetCDF is flat.

I don't see a lot of benefit in either of the high level file format APIs for our system, but using a parallel file system and MPI-IO could simplify our code, make it more robust, and give us a significant speed improvement over our current method of using MPI scatter/gathers.  We would need to investigate whether Scali's MPI implementation includes ROMIO or whether we would need to upgrade to MPICH2.  Since Scali's version is based upon an older version of MPICH, it might.  For the parallel file system we should probably use PVFS2, since it has MPI-IO optimizations for Linux and we also know several people who have an intimate knowledge of the PVFS2 code base.

All in all I learned a lot today.  I'm looking forward to learning much more as the week goes on, but even if this were the only useful thing about the conference I would say it was definitely worth the trip!  I'm already looking forward to getting back home and getting an opportunity to try out the new techniques we are learning.  Stay tuned for day two!

Oct 10

Embedded Java RDBMS

Recently I have been researching embedded Java relational databases (RDBMS).  There's a project I want to start up soon, and for the persistent storage side of things a database with a SQL interface would make life a lot easier in many regards.  So began my search for a robust, free, embeddable Java RDBMS.  My requirements for the project are not unreasonable (at least in my opinion):

  1. Fast selects of large sets of data or large subsets of data; it's OK if inserts are significantly slower.
  2. Database stored in a single file, or the ability to zip or tar it into a single file for the user to interact with.  This is so it is simple for the user to make backups or send the database to another computer.
  3. Small footprint, the smaller the better; as the footprint grows it becomes less likely that the application can be distributed effectively via Java Web Start.
  4. Standard SQL interface and JDBC driver so the application can easily connect to a remote RDBMS like MySQL or Oracle at some point in the future.
  5. Programmed in native Java to increase the chance it can run anywhere.

The following features would be nice but are not required:

  1. Ability to encrypt data with AES or some other standard encryption method.
  2. Built-in compression to reduce disk footprint.
  3. ACID compliant to ensure no data loss.
  4. Non-restrictive open source license that allows commercial distribution (this is just a hobby project right now but who knows what may happen in the future; it's always best to leave all doors open).

I found a handful of databases out there but for this article I wish to focus on just a few.  There were 4 systems available that caught my attention: Apache Derby, Daffodil One$DB, HSQL, and SQLite.

Apache Derby is a subproject of the Apache DB project.  It was originally developed by Cloudscape, which Informix acquired in 1999; IBM then bought Informix's database business in 2001.  It then became IBM Cloudscape and was donated in 2004 to the Apache Software Foundation as an open source project.  Since then Apache has maintained it as an incubator project, and it just recently (August 2005) graduated from the incubator program.  Overall I liked Derby; it was easy to work with and boasted a robust feature set.  I particularly liked that encryption was available natively and that all major SQL-92 and ANSI SQL-99 features were available, including transactions and triggers.  Among the things I did not like was the speed of the system: selects were reasonably fast, but it was the worst performer for inserts of the databases I tested, taking over 30 seconds to do 10,000 simple inserts.  My other complaint is the footprint of the jar file; at slightly over 2MB it seems excessive compared to other systems.  Finally, there was no obvious way to make single file databases.  You could point the system to a zip or jar file if it was a read-only database you wanted, but that doesn't work when using the system for an application's primary file format.  It feels like Derby is trying to be all things to all people, which makes it not a great choice for anyone.  Hopefully in the near future the developers will settle on a specialized track and do that one thing really well.

Daffodil One$DB was my favorite database of the 4 I tested.  It is a derivative of Daffodil DB, which is a closed source solution.  Recently Daffodil decided to donate the sources to the open source community and start a project up on SourceForge.  One$DB is now released under the LGPL, which is a nice open source license for libraries.  As of this writing it was the only open source database I could find that was purely Java and allowed the database to be stored in a single file.  Most SQL-92 and SQL-99 features are supported, and in general the system was fast and responsive.  On my machine selects were quick and it only took a few seconds to do 10,000 simple inserts.  The embedded jar is just under 200K, which is nice, but there is a Common jar also required that clocks in at just over 3MB.  I was not a huge fan of this, but I think it may be possible to trim a lot of fat out of it and get close to 300K, which would be a very nice footprint.  There is no encryption available currently; however, Daffodil has announced they will be adding this in the coming months along with several other useful features.

HSQL was the first database I tried.  From everything I read this was supposed to be the fastest database, but after playing with it I found that inserts were definitely the fastest while selects were among the slowest, not to mention the high overhead for startup and shutdown.  I wasn't really impressed with this system in general.  All data is stored in memory unless you use a cached table, which is really annoying, and the data format for non-cached tables is just a list of inserts in the .script file.  That is the reason for the high overhead on startup: the in-memory database must be recreated every time the system starts up.  I was also unimpressed with the file size limitations that were present.  While I don't expect users to be storing even gigabytes of data in the application, it's nice to have the possibility to.  The footprint of HSQL was the smallest of all tested, clocking in at around 100K for the most optimized version and around 600K for the generic install.  HSQL did not appear to have any ability to use a single file for data storage.  I saw some mention of this possibility in several forums; however, I was never able to find the documentation on how to do this and the links provided never worked, so all I can assume is that this feature is not available.  All in all the HSQL system appeared to have the most hype and deliver the least bang for the buck.

Last but not least is the venerable SQLite.  SQLite is actually written in C, not Java, and try as I might I could not find a Java port, which is really too bad because this embedded database has a lot of really nice properties.  It natively uses only one file for storage, is very fast for both selects and inserts, and is simple, so it does not add a lot of bloat to an application.  In short it's almost exactly what I wanted.  There is a JDBC driver available which uses JNI to interface with the C libraries; however, in the end the lack of simple portability for this solution killed it.  I really wanted something that was natively Java so that I would have the best chance at compile once, run anywhere.  If anyone out there knows of a Java port of the SQLite libraries I would be very interested.  Perhaps one day when I am bored and get the hankering to do something cool I'll port it over and write a JDBC driver for it.  But until that day comes we must write it off as a no go.
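
For anyone who wants to repeat the "10,000 simple inserts" test against SQLite, here is a minimal sketch.  It uses Python's standard sqlite3 module rather than Java and JDBC, so it is not the harness behind the timings quoted in this article; the table name, the schema, and the decision to wrap all inserts in one transaction are assumptions made for illustration.

```python
import os
import sqlite3
import tempfile
import time


def time_inserts(db_path, n=10_000):
    """Insert n simple rows into a hypothetical 'sample' table and time it."""
    conn = sqlite3.connect(db_path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS sample (id INTEGER PRIMARY KEY, name TEXT)"
    )
    start = time.perf_counter()
    with conn:  # commits on success, so all inserts share one transaction
        conn.executemany(
            "INSERT INTO sample (id, name) VALUES (?, ?)",
            ((i, f"row {i}") for i in range(n)),
        )
    elapsed = time.perf_counter() - start
    count = conn.execute("SELECT COUNT(*) FROM sample").fetchone()[0]
    conn.close()
    return count, elapsed


def demo():
    with tempfile.TemporaryDirectory() as tmp:
        db_path = os.path.join(tmp, "sample.db")
        count, elapsed = time_inserts(db_path)
        files = os.listdir(tmp)  # what is left on disk once the connection is closed
    print(f"{count} rows inserted in {elapsed:.3f} seconds")
    return count, files
```

Listing the directory after the connection is closed shows the single-file property: the whole database lives in sample.db.  The elapsed time will vary a great deal by machine, and committing after every insert instead of once at the end would be far slower.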

For my project it looks like I will be investing more time into One$DB for now, because it provides the closest set of features to my requirements.  I would still be interested if anybody out there can suggest alternatives though.  I would also be interested if anyone knows of an interface that makes the Java app think it's writing to individual files and directories when in fact everything goes to a single file on disk; in short, a virtual file system.
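
To make the virtual file system idea concrete, here is a rough sketch that uses a zip archive as the single backing file.  It is written in Python with the standard zipfile module rather than Java, and the helper names (write_tree, read_file, list_dir) and the sample paths are hypothetical.  It is only meant to show the shape of the interface, not a solution I have found for Java.

```python
import io
import zipfile


def write_tree(archive, files):
    """Store a mapping of 'directory/file' paths to bytes inside one archive."""
    with zipfile.ZipFile(archive, "w", compression=zipfile.ZIP_DEFLATED) as zf:
        for path, data in files.items():
            zf.writestr(path, data)


def read_file(archive, path):
    """Read one 'file' back out of the archive by its path."""
    with zipfile.ZipFile(archive, "r") as zf:
        return zf.read(path)


def list_dir(archive, prefix):
    """List the entries whose path starts with the given 'directory' prefix."""
    with zipfile.ZipFile(archive, "r") as zf:
        return sorted(name for name in zf.namelist() if name.startswith(prefix))


def demo():
    archive = io.BytesIO()  # stand-in for a single file on disk
    write_tree(
        archive,
        {
            "tables/users.dat": b"alice,bob",
            "tables/orders.dat": b"1,2,3",
            "meta/version.txt": b"1",
        },
    )
    return read_file(archive, "tables/users.dat"), list_dir(archive, "tables/")
```

The application sees paths and directories while only one file exists on disk.  The obvious weakness is that a zip archive cannot update an entry in place, so a database that rewrites its files constantly would need a smarter container than this.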

Oct 07

Bash and TCsh compared

I am a tcsh fan.  I just don't like bash.  Most of my reasons revolve around not being very familiar with the bash shell, but in general bash just feels awkward and uncomfortable to me.  A good friend of mine loves bash and is constantly evangelizing it, so I thought I would do a bit of research on the two shells.

In general my findings were that tcsh provided a superior user experience while bash was better at scripting.  Since I do most of my scripting in python anyway, that was no big loss to me.  Here's a chart giving a general overview of features in each of the major shells.  Note: Chart is courtesy of Fermilab at http://computing.fnal.gov/cd/unixlinux/unixatfermilab/html/shells.html#4132

Criteria                  sh    ksh   bash  csh   tcsh
Configurability           -     +     ++    +     ++
Execution of Commands     +     +     +     +     ++
Completion                --    +     ++    +     ++
Line Editing              -     +     ++    +     ++
Name Substitution         +     +     ++    +     ++
History                   --    +     ++    +     ++
Redirections and Pipes    +     +     +     +     +
Spelling Correction       --    --    --    --    +
Prompt Settings           +     +     +     +     ++
Job Control               --    +     +     +     +
Execution Control         +     +     +     +     +
Signal Handling           +     +     +     -     -

-- No Support      - Poor      + Good      ++ Very Good

Sep 24

VMS

Recently I've been learning a bit about VMS because I have had to use it on occasion down here in Australia.  I'm pretty good with unix, but I know next to nothing about VMS and am prone to making comments like "why would anyone want to use VMS?" or other not so polite comments.  I've been informed that VMS is actually quite an intuitive system and is far superior to unix.  I hope that you can detect the sarcasm in my voice, because nothing could be further from the truth.  Now before I go too far down the road on this VMS bash, I would like to say that I don't know a whole lot about the system and that some things about it sound pretty cool, such as the versioned file system, the complete control over users in the system and the resources they are allowed, and finally the clustering abilities the system has.

Back to my complaining.  Firstly, I have been told VMS is intuitive because commands are English-like: copy or dump versus cp and od.  While I agree that it's probably easier for a new user to understand the non-abbreviated nomenclature of VMS, I am not convinced that this is a feature that makes me want to use VMS.  It's not difficult to learn the commands in unix, and after you have been in the unix world for some time they are quite intuitive.

At any rate that's not really where I take issue with VMS; it's the filesystem that really annoys me.  The first question I would ask anyone who claims that VMS is intuitive is: how do you get to the root directory?  In unix it's "cd /", simple and to the point.  The filesystem is a tree and it makes sense.  It takes me 5 minutes to explain how the filesystem works to a newbie.  In VMS it's "set default [000000]".  Remind me again how that is intuitive?  Why 6 0's and not 5?  Turns out it has to do with old hardware, but I am amazed at how anyone can call VMS intuitive after that.  How about getting to your home directory?  In unix (tcsh) you would type something like "cd $HOME" or "cd ~"; the second is convenient and the first is very intuitive.  Now the VMS version: "set default SYS$LOGIN;".  Not hard, but definitely not intuitive.

I am not a fan of new OSes being rejected because they don't feel like unix, but you have to admit there are some really nice things about unix, and it's a good thing that VMS is dying.  The future is not VMS, and there is a very good reason for that: operating systems have evolved beyond it.  It was good for its time, but it's time to move on.

Ok I'm done ranting now.
