PAW - Process & Analytics Workbench

Playing with Data in PAW

This blog is about all the thoughts, features, decisions, and musings going into creating PAW - the Processing and Analytics Workbench for managing your data.

Monday, November 30, 2009

Cut'n'Paste matters

The primary use case we've been targeting is one where users want to build a repeatable process: pulling data from a reliable external source (something that may need to be done again and again), massaging it, perhaps with some other data, and then doing something with that data (loading it into a database, calling a web service, sending emails, ...). Now our focus is shifting to how we make users more productive once they have their data in PAW.

Filtering has been made simple, even more so than the auto-filter capability in spreadsheet tools. Choosing fields and changing their order was another must-have for analyzing data with lots of columns. And then, finally, there's cut-n-paste.

Cut and paste is actually the primary mechanism users use to move data around, typically small bits of data, maybe used as a reference table for mapping or some departmental data needing slicing and dicing. We now allow the user to create a new table with the data or to paste it into an existing table.




First you paste your data and specify what table to load the data into. Then you map the fields to existing columns or specify new fields and their types.




And voila - you're done!


We've made the feature as simple as can be, but under the hood it needs to ensure that all the fields can be converted correctly. Pasting into an existing table is just as simple. I am looking forward to feedback on how this feature changes what types of data you work with in PAW.
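To give a feel for the conversion check described above, here is a minimal sketch in Python. It is not PAW's actual code: the type names, the `CONVERTERS` table, and the `parse_pasted` and `convert_rows` helpers are all hypothetical, and it assumes the pasted text is tab-separated with a header row, as it typically is when copied from a spreadsheet.

```python
import csv
import io
from datetime import date

# Hypothetical type names and converters; PAW's real field types may differ.
CONVERTERS = {
    "text": str,
    "integer": int,
    "decimal": float,
    "date": date.fromisoformat,
}

def parse_pasted(text):
    """Split tab-separated clipboard text into a header row and data rows."""
    rows = list(csv.reader(io.StringIO(text), delimiter="\t"))
    return rows[0], rows[1:]

def convert_rows(header, rows, mapping):
    """Convert every mapped field, collecting failures instead of stopping.

    mapping: pasted column name -> (target column name, type name)
    """
    converted, errors = [], []
    for line_no, row in enumerate(rows, start=2):  # line 1 is the header
        record = {}
        row_ok = True
        for name, raw in zip(header, row):
            if name not in mapping:
                continue  # unmapped columns are simply skipped
            target, type_name = mapping[name]
            try:
                record[target] = CONVERTERS[type_name](raw.strip())
            except ValueError:
                errors.append((line_no, name, raw))
                row_ok = False
        if row_ok:
            converted.append(record)
    return converted, errors

# Made-up sample data: the second data row has a value that is not an integer.
pasted = "Region\tUnits\tShipped\nEast\t12\t2009-11-02\nWest\ttwelve\t2009-11-03\n"
header, rows = parse_pasted(pasted)
mapping = {
    "Region": ("region", "text"),
    "Units": ("units", "integer"),
    "Shipped": ("ship_date", "date"),
}
converted, errors = convert_rows(header, rows, mapping)
```

In this sketch, rows that convert cleanly end up in `converted`, while `errors` records the line, column, and raw value of anything that could not be converted, so the user can be told exactly what to fix before the table is loaded.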

Friday, July 17, 2009

Technorati, here we come

eqtk34c7ry

Friday, July 3, 2009

Automatic Updates for PAW, finally!

The web has, of course, completely changed expectations for the feedback loop between customers and suppliers. In the software business, that means you need to be able to turn around requests for features and bug fixes much more quickly than in 6-month or 1-year cycles.

For installed software, a key aspect of meeting this need is automatic updates. You're constantly getting bug fixes out and incrementally adding features based on user demand. The first requirement is the ability for the customer to very easily obtain an updated version of your product. Still, a lot of customers will not bother to check for updates periodically. Bugs in earlier versions can be a huge stumbling block to people continuing to use your product. Without some sort of automated prompt to get an update installed, your customers are typically not seeing all the work your company is doing to improve the product.

So, we just implemented automatic updates for PAW! It is seamless, it is robust, and it just works!

Automatic updates need to be a part of any platform for building software products. Eclipse RCP (Rich Client Platform) provides an excellent foundation for building software. With the latest release, Galileo, automatic update support is finally there with Equinox P2. We continue to be happy to have chosen this platform for building out PAW.

Thursday, March 12, 2009

Performance Tuning for PAW UI

As we near the final release stage for our version 1.0 product, the last few releases will be focused on performance. We have been tweaking the UI, so that the user gets into the product and is able to start using it easily. This caused us to move a lot of the functionality to a web-based paradigm - using browser controls for the menuing, displaying data, as well as performing certain actions (filtering, sorting).

Web-Based Data Display
So we decided to focus on the web-based data table as the primary go-forward option. We made it the default for showing large sets of data resulting from running processes. It was already the default for tables that one can edit, but processing results tend to be much larger and hence it is imperative that the user can see and navigate through the data quickly. Hence the focus on performance.

In terms of performance enhancements, we reduced the time to display large datasets from 50+ seconds for 3k+ rows (when using the web-based rich table) to 7-8 seconds. This is in line with the native Windows table, which lacks the rich functionality (multiple lines in a table, filtering, etc.).

This was accomplished by using DocumentFragment for DOM manipulation, so that we are not causing the browser to reflow the document each time a cell is added. Many thanks to John Resig (http://ejohn.org/blog/dom-documentfragments/) for this helpful hint. It made an almost 10x improvement, beyond the 2-3x improvement that's been noticed for smaller amounts of DOM manipulation.

Interactive UI
Process runs have also been moved to being "jobs" which run outside of the UI and then update the UI once the data has been refreshed. This allows the user to continue to perform other actions. Again, this makes the application more responsive to user input. Similarly, we have added a preview when importing data from the web, allowing the user to see where a page might have changed in case the web import isn't working as expected. This is also provided in a separate window.
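The "job" idea can be sketched in a few lines. This is only an illustration in Python, not how PAW implements it: `run_process`, `submit_job`, `drain_ui_events`, and the `ui_events` queue are all made-up names. The long-running work goes to a worker thread, and the finished result is queued so that the UI thread can apply it whenever it gets around to it.

```python
import queue
from concurrent.futures import ThreadPoolExecutor

ui_events = queue.Queue()  # finished results waiting for the UI thread

def run_process(name):
    """Stand-in for a long-running process run; returns the refreshed rows."""
    return [f"{name}-row-{i}" for i in range(3)]

def submit_job(executor, name):
    """Start the run on a worker thread and return immediately."""
    future = executor.submit(run_process, name)
    # The callback only queues the result; the UI thread applies it later.
    future.add_done_callback(lambda f: ui_events.put((name, f.result())))
    return future

def drain_ui_events(table):
    """Called from the UI loop: apply any finished results to the table."""
    while True:
        try:
            name, rows = ui_events.get_nowait()
        except queue.Empty:
            break
        table[name] = rows

table = {}
with ThreadPoolExecutor(max_workers=2) as executor:
    submit_job(executor, "monthly-report")
    # The UI thread is free here to keep handling other user actions.
drain_ui_events(table)
```

Handing results over through a queue, rather than touching the table from the worker thread, reflects the usual rule that only the UI thread should update UI state. Real UI toolkits provide their own mechanism for that hand-off.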

As next steps, we will be looking at tuning some of the more resource-intensive operations, like merging large datasets, through the use of more advanced algorithms and more specialized use of RAM.

Monday, March 2, 2009

Providing simple and advanced options

It has been a long time since I've posted. I've been hard at work on user research: understanding our users and what issues they might be having. One of the things we found is that, with software as powerful as ours, people need an easy starting point. Imagine if you saw Excel for the first time and were trying to figure out what to do.

So we've been busy adding more user-focused questions through pop-ups asking users if they would like X or Y ("would you like to re-run your process now?"). Most importantly, we've added a user guide view which distills the essence of what PAW enables and provides a focused task pane for just those functions. We feel so positive about this that we've made it the default starting view when users start the application!

Existing advanced view:



And the new simpler user guided view:




Please send us your feedback as we try to make further improvements!

Tuesday, August 26, 2008

The Perfect Match

Today at work I was interviewing some candidates for a position we have been trying to fill for quite some time now, without being able to find a good match. This experience got me thinking: can PAW be a "perfect match"?

Today was also a billing day, one of the two days in the month when I am drowning in numbers, pulling multiple time sheets for all my direct reports and working with multiple Excel worksheets. Over these two days I will have worked 10-12 hours per day pulling reports, summarizing them, matching and merging, and finally building out my worksheets to be sent to my finance department for billing.

Month after month there is all this manual labor involved in putting together these workbooks. This prompted me to think of getting a software product that would help reduce it. It would be so nice to get back 4-6 hours. A 50% time saving: SWEET. I opened PAW and started to build my monthly time reporting worksheet from scratch. It took me about an hour to set up and automate the process the way I wanted it. I tested it and it worked. Still, I wondered if there was something out there that could do everything I had in my head, since PAW only addresses 85% of the steps I envisioned.

On the drive back home I realized that next month I will only have to do 15% of the work, as PAW will automatically do the remaining 85% and export the Excel file to me. That made me so happy. In today's world of instant gratification, where software only gets cheaper, faster, and better, one has to realize that "the perfect match" is also changing constantly: what's in today is out tomorrow, and what's considered hot today is dead tomorrow. I now look at PAW in a new light, my PERFECT MATCH light: it is my solution for today, and tomorrow will bring something that is faster, cheaper, and better. Until then, I have found my PERFECT MATCH.

Wednesday, August 20, 2008

Marketing Challenge

I have been working in the field of marketing for quite a few years now. Most of my job involves interacting with data at some level or another. Some things I regularly work on are getting data files of customers for my next email campaign, checking the status of payments received from a particular client, or simply exporting all the responses from my marketing survey.

Every time I get data I have to spend time doing some processing to it so that I can visualize it in a fashion that is understandable to me. Today I was faced with such a situation and PAW came to my rescue.

Today, I received twelve files of data exported to Excel from a vendor who implemented one of my customer surveys. All of the raw data I collected was important to me; however, in addition to this raw data, there was more value in analyzing the following:

  1. How many of my customers completed more than one survey?
  2. What is the number of new customers to the email file vs. my control?
  3. Isolate email addresses that have subscribed to my monthly newsletter
  4. Isolate the email addresses that have shown an interest in my next webinar

Traditionally, I had to have someone from my team or the database manager write some SQL queries to parse this information and send it to my team. In the PAW world today, I or someone from my team pulls all the information into PAW and within the hour has all the various data files ready to send to the various teams.

Merging or de-duping data takes minutes in PAW, even for large files (each of my files had over 35k rows in it), and what's even better is that I don't have to wait for other teams to send me the parsed data. My email marketing campaigns are more relevant as they are sent out in a timely fashion, bringing more ROI and revenue to PAW.
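For readers curious what merging and de-duping amount to, here is a rough sketch in Python. It is not PAW's implementation: the `email` and `survey` column names, the sample rows, and the helper functions are all invented for illustration, and it assumes two addresses are the same customer if they match after trimming spaces and ignoring case.

```python
from collections import Counter

def normalize(email):
    """Treat addresses that differ only by case or padding as the same."""
    return email.strip().lower()

def merge_files(files):
    """Concatenate the rows from several exports into one list."""
    return [row for rows in files for row in rows]

def dedupe_by_email(rows):
    """Keep the first row seen for each normalized email address."""
    seen, unique = set(), []
    for row in rows:
        key = normalize(row["email"])
        if key not in seen:
            seen.add(key)
            unique.append(row)
    return unique

def repeat_respondents(rows):
    """Emails that appear on more than one survey row."""
    counts = Counter(normalize(row["email"]) for row in rows)
    return sorted(email for email, n in counts.items() if n > 1)

# Made-up sample rows standing in for two of the vendor's export files.
file_a = [{"email": "ann@example.com", "survey": "S1"},
          {"email": "bob@example.com", "survey": "S1"}]
file_b = [{"email": "Ann@Example.com ", "survey": "S2"},
          {"email": "cy@example.com", "survey": "S2"}]

merged = merge_files([file_a, file_b])
unique = dedupe_by_email(merged)
repeats = repeat_respondents(merged)
```

Here `unique` is the de-duped mailing list, and `repeats` answers the first question from the list above: which customers completed more than one survey.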