Tuesday, July 29, 2014
Monday, May 27, 2013
Wednesday, February 13, 2013
- Regular Expressions used to find patterns and to remove or alter text, enabling standardization of names and addresses
- Equi-joins and other join types to match records
- Soundex or Metaphone function in combination with other matches to enable fuzzy matches
- Jaro-Winkler, Levenshtein and Distance functions for fuzzy matching
- ETL Tool Functionality which further extends the base database functionality
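To make the list above a bit more concrete, here is a minimal sketch in plain Python of how three of these techniques can be combined: regular-expression standardization, a Soundex code, and Levenshtein distance. It is an illustration under stated assumptions and not the solution I will be presenting: the abbreviation map, the function names and the `max_distance` threshold are all hypothetical, and in a database you would normally lean on the built-in functions rather than hand-rolled ones.

```python
import re

# Hypothetical abbreviation map, for illustration only.
ABBREVIATIONS = {"street": "st", "avenue": "ave", "road": "rd"}

# Standard American Soundex letter groups.
SOUNDEX_CODES = {}
for letters, digit in (("bfpv", "1"), ("cgjkqsxz", "2"), ("dt", "3"),
                       ("l", "4"), ("mn", "5"), ("r", "6")):
    for ch in letters:
        SOUNDEX_CODES[ch] = digit


def standardize(text):
    """Lowercase, strip punctuation, collapse whitespace, abbreviate common words."""
    text = re.sub(r"[^\w\s]", "", text.lower())
    return " ".join(ABBREVIATIONS.get(word, word) for word in text.split())


def soundex(name):
    """Return the four-character Soundex code, or '' if there are no letters."""
    name = re.sub(r"[^a-z]", "", name.lower())
    if not name:
        return ""
    result = name[0].upper()
    prev = SOUNDEX_CODES.get(name[0], "")
    for ch in name[1:]:
        if ch in "hw":
            continue  # h and w do not separate letters with the same code
        code = SOUNDEX_CODES.get(ch, "")
        if code and code != prev:
            result += code
        prev = code  # vowels reset prev, so a repeated code after a vowel counts
    return (result + "000")[:4]


def levenshtein(a, b):
    """Edit distance: minimum number of inserts, deletes and substitutions."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        curr = [i]
        for j, cb in enumerate(b, 1):
            curr.append(min(prev[j] + 1,                # delete from a
                            curr[j - 1] + 1,            # insert into a
                            prev[j - 1] + (ca != cb)))  # substitute
        prev = curr
    return prev[-1]


def is_probable_match(a, b, max_distance=2):
    """Exact match after standardizing, else same Soundex and a small edit distance."""
    a, b = standardize(a), standardize(b)
    if a == b:
        return True
    return soundex(a) == soundex(b) and levenshtein(a, b) <= max_distance
```

With this sketch, `is_probable_match("Jon Smith", "John Smyth")` comes back true, while `is_probable_match("Jon Smith", "Mary Jones")` does not. Requiring the phonetic code and the edit distance to agree is one way of using a fuzzy function "in combination with other matches" to keep false positives down; the right threshold depends on your data.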
I will be presenting this solution at COLLABORATE13 in Denver in April, and this entry should help you as you consider an alternative approach to matching which will be critical to your MDM solution.
Tuesday, October 23, 2012
Last week I attended Oracle OpenWorld. This was at least my 10th time attending the event and it continues to grow and change as Oracle does. One thing that does not change is San Francisco, which is always amazing and this year was sunny and warm.
The conference is the annual gathering hosted by Oracle and is the place where the company gets to talk about what is new and what is influencing our businesses. So Oracle made many announcements during the week related to Cloud, Big Data, new hardware and a new release of the Database. Each is hot these days and Oracle continues to bring new and updated offerings to the market to meet this changing landscape.
In both the Big Data and Cloud spaces, this was a year to encourage the adoption and use of the technology. For Cloud, Oracle offered plans for people to migrate from in-house systems and discussed strategies to make the transition as easy as possible. The challenge I heard from companies moving to the Cloud is that their existing systems are old or have been significantly customized, so the move is neither simple nor straightforward. For some companies this move will not be as easy as was described.
The big news for me was that Oracle officially announced a new version of the database; version 12c is on the way. This new version has a number of enhancements; the biggest for me was the concept of the pluggable database. The pluggable database is a feature which helps databases react better to hardware, platform and version changes. A pluggable database can be easily moved to another container database, which can have many pluggable databases within it. Of course Oracle made other changes in the new version of the database, but this was the most significant as it changes the underlying architecture of the database.
Of course as usual the big buzz was about Big Data. Oracle continued to sell the concept and help customers to see how Big Data can help. This year the idea transitioned from theory to practice and experiences. People were now discussing use cases (as I did during my Big Data presentation). The why is moving to the how. Below is Andy Mendelsohn, Oracle Senior VP Databases telling us about the Oracle stack for Big Data and how the Big Data Appliance can help.
Overall the event was the usual offering for Oracle OpenWorld which helped many to better understand what is coming and how we need to get ready for it.
Tuesday, September 11, 2012
It’s that time of the year again. The summer is slowly coming to a close. The evenings are getting colder and the days shorter. The other thing that arrives at this time of the year is Oracle OpenWorld; the annual gathering of Oracle customers hosted by Oracle. It all starts on September 30th, when I and lots of other Oracle professionals will descend upon San Francisco for the annual event.
I have to admit that one of the great highlights is the very first event of the week. This is the IOUG at the User Group Sunday event, where the user group starts the week with presentations, discussions and panels on the deepest parts of Oracle’s technology, and where users share their stories and experiences. I will be presenting a seminar about Big Data: the Future is Now. OpenWorld is the place where I get to meet old friends and colleagues and hang out with my cool cousin (who lives in San Fran). It is one of two gatherings of users and it is the place I renew many friendships. It is like a geek pilgrimage.
The remainder of the week is all about Oracle. It is at OOW where we get to hear from Larry Ellison and listen to his vision for the future. We hear about some of the new technologies which will become part of our fabric in the future. I remember hearing about Big Data a few years ago as a concept and now it is becoming mainstream. I will be speaking during the week about how Agile has helped EPAM to deliver Big Data projects; a very exciting topic these days to allow for effective creation of data and reporting solutions. And of course there are the networking events… this year we even get to see Pearl Jam. And then some Oracle Music festival which includes Macy Gray. How do they do it?
So why not come by and see all of us in the user groups and become part of the fun?
Here are Ian’s Top 5 Benefits of the IOUG at OpenWorld
5. Best directions to sessions
4. IOUG helps people separate reality from hype… after 5 drinks.
3. Get to finally meet TV star… John Matelski. Looks like he may be the next host of Meet The Press!
2. Coolest t-shirts
1. Special IOUG lines at all food and drink counters for all OpenWorld events!
So I hope to see everyone at the event. You will find the IOUG booth at Moscone West in the user group pavilion. See you there.
Thursday, May 24, 2012
There are times in your life where you get a chance to experience something different and exciting and last week was one of these experiences which I will not soon forget. Last week I had the opportunity to be part of a Business Intelligence event being held in Minsk, Belarus by my new company EPAM.
The chance to go half-way around the world to speak about my favorite topics was as exciting as it was scary. I travel a lot for work and pleasure but this was different. The countries of the former Soviet Union have always held a special connection for me, as my grandparents were from Latvia and the Ukraine, and Minsk is right between the two. I didn’t know what the trip might hold; would I be able to get around without speaking any Russian? Would they like what I had to say? At least we had some commonalities like hockey, the weather and our love of data. All my fears quickly dissipated once I finally left the airport. The Minsk airport is still a bit of a throwback to the days of Soviet rule.
The airport has only 6 gates and no lines. When you arrive in Belarus you need to purchase medical insurance (2 Euros) and of course meet with Passport control to get final clearance into the country. This was the first time I travelled to a country which required a visa, and although I was anxious before my trip about entering the country, it was a smooth trip to Minsk.
I was headed off to visit the team at EPAM. The trip took me through the countryside and into the city. I was struck by how much the Minsk area looked like any Canadian place. The forests of white birches were a welcome sight. The cars that they drive there are no different from ours, but what was different was the grandeur of the architecture and how it seemed like a modernized version of the old Soviet Union. The streets in the city core are wide and grand. Below is an image of Independence Square which houses the Belarusian government and a huge shopping centre which is right below the square.
This is what I expected; these are the type of buildings which I pictured in my mind. The biggest realization was that the shops were fully stocked with goods, much like we have in Canada. The brands might be different and you can buy vodka for less than $10, but this is a country where success is coming as they evolve from their modest past. This is a country that has welcomed the new age and is working to grow.
The visit to EPAM and speaking at a BI event in Minsk was my reason for being there. EPAM is a company with a strong knowledge base in many technical areas. They are a company of 9,000 professionals who deliver top-notch solutions and now, with the purchase of Thoughtcorp, they can begin to show that strength in Canada along with our team.
So back to the experience. I presented at the first <epam> BI conference held in Minsk. It covered subjects ranging from what data is and why it is important to how Oracle and Microsoft can support data solutions for their customers. It was a great time where I discussed the future of data and how important it is becoming today, and how businesses which embrace data and fact-based decision making can make a significant impact on an organization’s success. The audience was great and the questions were thoughtful and interesting. It was a great experience which I will not soon forget.
The next day was a visit to <epam> and I had the chance to talk about my experiences using Oracle and how we run data projects. I was again struck by the quality of the people I met. This was a strong team which wanted to learn more and get even better. We discussed Oracle direction around data and how we can get the most from our database investments. This is a picture of the offices in Minsk, with the lead of the Oracle Performance team, Andrei. It was an awesome experience and was a great introduction to the people and skills which EPAM brings to the market. I thank them all for letting me be a part of it all.
Of course no trip to Minsk would be complete without some comments about the food and drink. Belarus is known for some great vodkas and this trip I got to experience many choices. Straight vodka (awesome), cranberry vodka (awesomer) and some vodka made with Bison grass (a spicy, smoky awesomeness). The food was also an experience and reminded me of my childhood when my family would make very similar dishes. The potato pancakes were a throwback to the days when fried food was good for you. All of the food here is made from scratch and I am told it is all organic.
Overall, the trip to Minsk was an experience which I will never forget. The team at EPAM taught me a lot as well. It was a chance to see how a country like Belarus can rise to become a modern and technically advanced country. Belarus is a place with few natural resources, but one thing they do have is a lot of smart people doing some very innovative things. It was great to experience and I look forward to my next visit to Minsk… after all I forgot my sports jacket in a bar in Minsk, so I need to go and pick it up!
Monday, April 9, 2012
Delivering data projects has always been fraught with danger. Data projects tend to be large and encompass many aspects of the organization. As a result, building a complete data solution can take months or even years. Too often, at the end of a long waterfall-style project, the results were less than expected. The system may have hit many of the business requirements, but it missed on others, while new requirements have not even entered into the delivery process. It is said that 50% of data warehouse projects fail, while other studies have shown this to be even higher. In my travels, I would say that we hit the target most of the time, but usually it takes longer to achieve than expected. So why should we then consider Agile?
Agile is the approach which is based on four basic values which were defined in the Agile Manifesto. These are:
- Individuals and interactions over processes and tools
- Working software over comprehensive documentation
- Customer collaboration over contract negotiation
- Responding to change over following a plan
You should note that we value both sides of each statement, but that we value the ones on the left more. These basic values provide us with a basis for working in a collaborative environment which can focus on incremental working software.
How does Agile help us to be more successful in data projects? The project approach for data which we follow at Thoughtcorp focuses on using Agile. This results in an approach which delivers the solution incrementally. By understanding the big picture of an organization's data needs, we can divide a project into iterations which build upon each other while delivering working software. This is done via a prioritization process. In the Agile world this is known as Kanban Development. In this approach we continually review our priorities and check to see if the business now has new requirements. This allows the project to alter its trajectory based on real needs which are now better understood. This does result in improved project performance and a better solution for everyone.
The basic answer is that it does help. It has been shown that data projects can be delivered faster and more effectively. We have seen that our productivity is increased by 27% and that defects are reduced by 35% versus typical data projects. In addition the number of features delivered was higher than anticipated. All of this resulted in a project which included over 300 data objects and 150 reports and was delivered successfully in 8 iterations. Below is what our project wall looked like as we were working on things, as we had a lot of collaboration take place:
This project was one example of how Agile made data work. We learned along the way and perfected our approach. The great thing about Agile is that you make it work for your team and your projects, but you have to invest the time and effort to make it work.
And don’t forget COLLABORATE 12 is coming up in Las Vegas in just a couple of weeks. I hope to see everyone there. I will be speaking all about Big Data and how it fits into today’s data ecosystem.