PAW - Process & Analytics Workbench

Playing with Data in PAW

This blog is about all the thoughts, features, decisions, and musings going into creating PAW - the Processing and Analytics Workbench for managing your data.

Saturday, February 12, 2011

Auto-Update Again

We have been using the Eclipse RCP framework as the basis for PAW. It comes with auto-update capability. However, once we started to use it, it became very complex: as the developer, it's very hard to understand how the components are managed and, when something goes wrong, what exactly is wrong. We were also having to release changes both as updated installers and through the Eclipse update mechanism. After a couple of issues that I spent weeks trying to debug and fix, I finally gave up on the Eclipse P2 mechanism - it's too complicated, adds about 5-8MB to the download, and was making me spend more time on the framework than on the product functionality.

Now we're using the Eclipse OSGi mechanism to do installs and updates. It's much simpler to understand (the bundles are in clearly marked directories), and the API for making updates is much simpler too.

TO INSTALL:
Bundle bundle = bundleContext.installBundle("http://myserver/bundles/mybundle-1.0.0.jar");

TO UNINSTALL:
bundle.uninstall();


The final program structure I came up with has a launcher which also comes with an installer. It runs with the OSGi bundle and some SWT libraries to show progress in a UI. It is basically something like this:


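import java.util.HashMap;
import java.util.Map;
import java.util.ServiceLoader;
import org.osgi.framework.BundleContext;
import org.osgi.framework.launch.Framework;
import org.osgi.framework.launch.FrameworkFactory;

// create the framework
ServiceLoader<FrameworkFactory> loader = ServiceLoader.load(FrameworkFactory.class);
FrameworkFactory ff = loader.iterator().next();
Map<String, String> config = new HashMap<>();
config.put("osgi.configuration.area", configDirectory);
config.put("osgi.user.area", workspaceDirectory);
config.put("osgi.instance.area", workspaceDirectory);
Framework fwk = ff.newFramework(config);

// start it
fwk.start();

// do any installs and uninstalls
BundleContext ctxt = fwk.getBundleContext();
ctxt.installBundle(...); // plus bundle.uninstall() for anything being removed

// start the right bundle, then get the application and launch it
Runnable app = (Runnable) getService("myapp.id"); // THIS IS PSEUDOCODE
app.run();

// finally stop the framework
fwk.stop();
fwk.waitForStop(0); // stop() returns immediately, so wait for shutdown to finish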



That was roughly all it took to get the code working and to reduce the client footprint by 8MB of Eclipse plugins that I wasn't using!
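The "do any installs and uninstalls" step needs some way to decide which bundles to touch. Here is a minimal sketch of one way that decision could be made, assuming the launcher knows the bundle versions installed locally and the versions published on the server. The class name UpdatePlanner and the two maps are hypothetical, for illustration only, and not the actual PAW code.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical helper: decides which bundles the launcher should install or
// uninstall. Names and data shapes are assumptions made for this sketch.
final class UpdatePlanner {

    // Result of comparing the local bundles with what the server publishes.
    static final class Plan {
        final List<String> toInstall = new ArrayList<>();
        final List<String> toUninstall = new ArrayList<>();
    }

    // Both maps go from bundle symbolic name to version string, e.g. "1.0.0".
    static Plan plan(Map<String, String> installed, Map<String, String> published) {
        Plan plan = new Plan();
        // TreeMap gives a stable, alphabetical order for the result lists.
        for (Map.Entry<String, String> e : new TreeMap<>(published).entrySet()) {
            String local = installed.get(e.getKey());
            if (local == null) {
                plan.toInstall.add(e.getKey());       // new bundle
            } else if (!local.equals(e.getValue())) {
                plan.toUninstall.add(e.getKey());     // replace the old version
                plan.toInstall.add(e.getKey());
            }
        }
        for (String name : new TreeMap<>(installed).keySet()) {
            if (!published.containsKey(name)) {
                plan.toUninstall.add(name);           // no longer shipped
            }
        }
        return plan;
    }
}
```

Each name in toUninstall would then map to a bundle.uninstall() call and each name in toInstall to an install from the server URL. Note that this sketch treats any version difference as an update and does not compare version ordering.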

Monday, November 30, 2009

Cut'n'Paste matters

The primary use case we've been targeting is where users want to build a repeatable process which involves pulling data from a reliable external source (it may need to be done again and again), massaging it, perhaps with some other data, and then doing something with that data (loading it to a database, calling a web service, sending emails, ...). Now our focus is shifting to how to make users more productive once they have their data in PAW.

Filtering has been made simple, more so than the auto-filter capability in spreadsheet tools. Choosing fields and changing their order was another must-have when you're analyzing data with lots of columns. And then finally, there's cut-n-paste.

Cut and paste is actually the primary mechanism users rely on to move data around. It is typically small bits of data, maybe a reference table used for mapping or some departmental data that needs slicing and dicing. We now allow the user to create a new table with the pasted data or to paste it into an existing table.




First you paste your data and specify what table to load the data into. Then you map the fields to existing columns or specify new fields and their types.




And voila - you're done!


We've made the feature as simple as can be, but underneath it needs to ensure that all the fields can be converted correctly. Pasting into an existing table is just as simple. I am looking forward to feedback on how this feature changes what types of data you work with in PAW.
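To give a feel for the conversion check mentioned above, here is a rough sketch of validating pasted, tab-separated text against the column types chosen in the mapping step. The PasteValidator class, the ColumnType enum, and the three types it lists are assumptions made for this example, not PAW's actual implementation.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: checks pasted, tab-separated text against the column
// types the user picked. Not PAW's real code.
final class PasteValidator {

    enum ColumnType { TEXT, INTEGER, DECIMAL }

    // True if the raw cell text can be converted to the given type.
    static boolean converts(String cell, ColumnType type) {
        try {
            switch (type) {
                case INTEGER: Long.parseLong(cell.trim()); return true;
                case DECIMAL: Double.parseDouble(cell.trim()); return true;
                default: return true; // any text is valid TEXT
            }
        } catch (NumberFormatException e) {
            return false;
        }
    }

    // Returns one message per problem found; an empty list means the paste is OK.
    static List<String> validate(String pasted, ColumnType[] types) {
        List<String> problems = new ArrayList<>();
        String[] rows = pasted.split("\\r?\\n");
        for (int r = 0; r < rows.length; r++) {
            // The -1 limit keeps trailing empty cells instead of dropping them.
            String[] cells = rows[r].split("\t", -1);
            if (cells.length != types.length) {
                problems.add("row " + (r + 1) + ": expected " + types.length
                        + " fields, found " + cells.length);
                continue;
            }
            for (int c = 0; c < cells.length; c++) {
                if (!converts(cells[c], types[c])) {
                    problems.add("row " + (r + 1) + ", column " + (c + 1)
                            + ": '" + cells[c] + "' is not " + types[c]);
                }
            }
        }
        return problems;
    }
}
```

Running the check before loading anything means the user can fix the mapping or the data up front, rather than ending up with a half-loaded table.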

Friday, July 17, 2009

Friday, July 3, 2009

Automatic Updates for PAW, finally!

The web has of course completely changed expectations for the feedback loop between customers and suppliers. In the software business, that means you need to be able to turn around requests for features and bug fixes much more quickly than in 6-month or 1-year cycles.

For installed software, a key aspect of meeting this need is automatic updates. You're constantly getting bug fixes out and incrementally adding features based on user demand. The first requirement is the ability for the customer to very easily obtain an updated version of your product. Still, a lot of customers will not bother to check for updates on a periodic basis. Bugs in earlier versions can be a huge stumbling block to people continuing to use your product. Without some sort of automated prompt to get an update installed, your customers are typically not seeing all the work your company is doing to better the product.

So, we just implemented automatic updates for PAW! It is seamless, it is robust, and it just works!

Automatic updates need to be a part of any platform for building software products. Eclipse RCP (Rich Client Platform) provides an excellent foundation for building software. With the latest release, Galileo, automatic update support is finally there with Equinox P2. We continue to be happy to have chosen this platform for building out PAW.

Thursday, March 12, 2009

Performance Tuning for PAW UI

As we near the final release stage for our version 1.0 product, the last few releases will be focused on performance. We have been tweaking the UI, so that the user gets into the product and is able to start using it easily. This caused us to move a lot of the functionality to a web-based paradigm - using browser controls for the menuing, displaying data, as well as performing certain actions (filtering, sorting).

Web-Based Data Display
So we decided to focus on the web-based data table as the primary go-forward option. We made it the default for showing large sets of data resulting from running processes. It was already the default for tables that one can edit, but processing results tend to be much larger and hence it is imperative that the user can see and navigate through the data quickly. Hence the focus on performance.

In terms of performance enhancements, we reduced the time to display large datasets from 50+ seconds for 3k+ rows (when using the web-based rich table) to 7-8 seconds. This is in line with the native Windows table without the rich functionality (multiple lines in a table, filtering, etc.).

This was accomplished using DocumentFragment for DOM manipulation so that we are not causing the browser to re-flow the document when each cell is added. Many thanks to John Resig (http://ejohn.org/blog/dom-documentfragments/) for this helpful hint. It made an almost 10x improvement, beyond the 2-3x improvement that's been noticed for smaller amounts of DOM manipulation.
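The pattern itself is small. The sketch below shows it with Java's W3C DOM binding (org.w3c.dom) to stay in the same language as the other snippets on this blog. In the browser the same createDocumentFragment and appendChild calls are made from JavaScript against a table that is already part of the live page. The class and method names are made up for the example.

```java
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.DocumentFragment;
import org.w3c.dom.Element;

// Illustrative only: builds all rows in a detached fragment, then attaches
// them to the table with a single appendChild call.
final class FragmentTableBuilder {

    static Element buildTable(Document doc, String[][] data) {
        Element table = doc.createElement("table");
        DocumentFragment fragment = doc.createDocumentFragment();
        for (String[] row : data) {
            Element tr = doc.createElement("tr");
            for (String cell : row) {
                Element td = doc.createElement("td");
                td.setTextContent(cell);
                tr.appendChild(td);
            }
            fragment.appendChild(tr); // detached: the table is not touched yet
        }
        // Single insertion: the fragment's children are moved into the table.
        table.appendChild(fragment);
        return table;
    }

    // Convenience for trying the sketch outside a browser.
    static Document newDocument() {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
        } catch (ParserConfigurationException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

The point is that the live table changes once per batch of rows instead of once per cell, which is what avoids the repeated re-flow in a browser.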

Interactive UI
Process runs have also been moved to being "jobs" which run outside of the UI and then update the UI once the data has been refreshed. This allows the user to continue to perform other actions. Again, this makes the application more responsive to user input. Similarly, we have added a preview when importing data from the web, allowing the user to see where a page might have changed in case the web import isn't working as expected. This is also provided in a separate window.
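As a rough illustration of the "job" idea, here is a sketch using plain java.util.concurrent rather than any Eclipse-specific API. The BackgroundRun class and its parameters are hypothetical. The work runs on a worker executor, and the result is handed to a callback through a separate executor that stands in for the UI thread.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.Executor;
import java.util.function.Consumer;
import java.util.function.Supplier;

// Hypothetical sketch: run a process off the UI thread, then update the UI.
final class BackgroundRun {

    // Runs 'process' on 'worker'; when it finishes, passes the result to
    // 'onDone' via the 'uiThread' executor so the UI is only touched there.
    static <T> CompletableFuture<Void> start(Supplier<T> process,
                                             Executor worker,
                                             Executor uiThread,
                                             Consumer<T> onDone) {
        return CompletableFuture
                .supplyAsync(process, worker)
                .thenAcceptAsync(onDone, uiThread);
    }
}
```

In an SWT application the uiThread executor could be something like display::asyncExec, while the worker could be a small thread pool. The user keeps working while the run is in progress, and the table refreshes when the callback fires.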

As next steps, we will be looking at tuning some of the more resource-intensive operations, like merging large datasets, through the use of more advanced algorithms and more specialized use of RAM.

Monday, March 2, 2009

Providing simple and advanced options

It has been a long time since I've posted. I've been hard at work on user research: understanding our users and what issues they might be having. One of the things we found is that, with software as powerful as ours, people need an easy starting point. Imagine if you saw Excel for the first time and were trying to figure out what to do.

So we've been busy adding more user-focused questions through pop-ups asking users if they would like X or Y ("would you like to re-run your process now?"). Most importantly, we've added a user guide view which distills the essence of what PAW enables and provides a focused task pane for just those functions. We feel so positive about this that we've made it the default starting view when users start the application!

Existing advanced view:



And the new simpler user guided view:




Please send us your feedback as we try to make further improvements!

Tuesday, August 26, 2008

The Perfect Match

Today at work I was interviewing some candidates for a position we have been trying to fill for quite some time now without finding a good match. This experience got me thinking: can PAW be a "perfect match"?

Today was also billing day, one of the two days in the month when I am drowning in numbers, pulling multiple time sheets for all my direct reports and working with multiple Excel worksheets. In these two days I will have worked 10-12 hours or more per day pulling reports, summarizing them, matching and merging, and finally building out my worksheets to be sent to my finance department for billing.

Month after month there is all this manual labor involved in putting together these workbooks. This prompted me to think of getting a software product that would help reduce it. It would be so nice to get back 4-6 hours. A 50% time saving ---> SWEET. I opened PAW and started to build my monthly time reporting worksheet from scratch. It took me about an hour to set up and automate the process the way I wanted it. I tested it and it worked. Still, I wondered if there was something out there that could do everything I had in my head, since PAW only addresses 85% of the steps I envisioned.

On the drive back home I realized that next month I will only have to do 15% of the work, as PAW will automatically do the remaining 85% and export the Excel file to me. That made me so happy. In today's world of instant gratification, where software only gets cheaper, faster, and better, one has to realize that "the perfect match" is also changing constantly: what's in today is out tomorrow, and what's considered hot today is dead tomorrow. I now look at PAW in a new light, my PERFECT MATCH light. It is my solution for today, and tomorrow will bring something that is faster, cheaper, better. Until then I have found my PERFECT MATCH.