A couple of weeks ago, Bradford Cross of FlightCaster posted in Measuring Measures that transactions are the next big data category. I argue that they already are. From reading his blog post, he seems to suggest this as well, though I will admit that I may have missed his point. There are some clear examples of transactions and their importance:
- Itemset Mining. Cross discusses this in his article. Financial transactions on sites like Amazon contain items (merchandise). Using these transactions, Amazon built a recommendation engine to recommend new items to customers on their website, and even customize deals for customers via email and on the site.
- Wireless Localization. Fantasyland at The Magic Kingdom in Walt Disney World was to undergo a big overhaul to provide a personalized experience on transactions through the park. An RFID chip would be included in a ticket (or some type of document) and the visitor’s information from a survey would be transmitted to the attraction’s intelligent system. Such a system would also provide Disney a wealth of information about what attractions certain audiences visit, when, how often, and even what items a visitor may purchase during the day.
- Website Conversion Path Optimization. A visit to a website consists of a series of lines in an Apache web log, and if done properly, these lines (called “hits”) can be mapped to an individual user (with a cookie, or some ID) and an analyst can view how visitors are navigating through the website. Then, an analyst may make a recommendation to an Engineering team to modify how visitors are ushered through the site to maximize revenue or conversions.
- Resource Allocation. One of the many applications of RFID technology is in location and optimization of hospital equipment and the paths they take through the hospital.
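To make the conversion path example concrete, here is a minimal sketch of how hits might be grouped into per-visitor navigation paths. The `(visitor_id, timestamp, url)` tuple layout and the function name are my own assumptions for illustration; a real Apache log would first need to be parsed into something like this.

```python
from collections import defaultdict


def group_hits_into_paths(hits):
    """Group hits by visitor and order each visitor's pages by time.

    `hits` is an iterable of (visitor_id, timestamp, url) tuples.
    This layout is a hypothetical, already-parsed form of a web log.
    """
    by_visitor = defaultdict(list)
    for visitor_id, timestamp, url in hits:
        by_visitor[visitor_id].append((timestamp, url))
    # Sort each visitor's hits by timestamp and keep only the URLs.
    return {
        visitor: [url for _, url in sorted(events)]
        for visitor, events in by_visitor.items()
    }


# Made-up hits, deliberately out of order.
hits = [
    ("a", 3, "/checkout"),
    ("b", 1, "/home"),
    ("a", 1, "/home"),
    ("a", 2, "/product"),
]
paths = group_hits_into_paths(hits)
print(paths["a"])  # ['/home', '/product', '/checkout']
```

Each resulting path is one "transaction" in the sense used here: an ordered trail that an analyst can compare against the path the site was designed to encourage.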
Transactions can be analyzed using several different statistical techniques.
- When the outcome, or response variable, is a continuous, real-valued measurement and the explanatory variable is time, time series analysis can be very useful for determining whether there is a trend in the data and what it may be. With respect to transactions, time series analysis may be useful for regularly repeated purchases, monthly credit card balances, etc.
- If the response variable is simply presence/absence of some attribute with time as the explanatory variable, use a temporal point process or a time-to-event model. Some non-transactional examples of temporal point processes are earthquakes (fixed location) or utterances of a particular word. Transactional temporal point processes may include an individual’s purchases of a particular chemical (counterterrorism), times of login on a particular machine, etc.
- If the response variable is continuous and real and the explanatory variable is space/location, then kriging (for prediction) or a geographic/map-based visualization (for description) would be appropriate. Transactional spatial data include amount won/lost at slot machines in a casino by location.
- If the response variable is presence/absence of some attribute with space as the explanatory variable, use a spatial point process. An example of a spatial transaction would be key swipes throughout the day.
- If the explanatory variables include both time and space, then a spatio-temporal point process would be appropriate for presence/absence data, and a surface analysis or multivariate kriging method may be appropriate for continuous data.
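The choices in the list above can be summarized as a small lookup table. This is only a sketch of the rules of thumb as I have stated them; the labels `"continuous"`, `"presence"`, `"time"`, and `"space"` are names I made up for this example.

```python
# Lookup table summarizing the bullets above. The keys and labels are
# illustrative names chosen for this sketch, not a standard taxonomy.
TECHNIQUES = {
    ("continuous", frozenset({"time"})): "time series analysis",
    ("presence", frozenset({"time"})): "temporal point process or time-to-event model",
    ("continuous", frozenset({"space"})): "kriging or map-based visualization",
    ("presence", frozenset({"space"})): "spatial point process",
    ("presence", frozenset({"time", "space"})): "spatio-temporal point process",
    ("continuous", frozenset({"time", "space"})): "surface analysis or multivariate kriging",
}


def suggest_technique(response, explanatory):
    """Return the suggested technique for a response type and explanatory variables."""
    key = (response, frozenset(explanatory))
    return TECHNIQUES.get(key, "no suggestion in this list")


print(suggest_technique("presence", ["time", "space"]))
# spatio-temporal point process
```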
In the past month, I have visited Nevada casinos twice: Las Vegas and Laughlin. We would like to believe that “slot” machines are random (albeit biased towards house advantage obviously) and anonymous. The fact of the matter is that your visit to a casino is simply one large transaction. The data generated from such transactions can allow casinos to skew machines even more in their favor. Note that this is just my opinion, and there could very well be Nevada state laws barring usage of such transactional data for skewing games in a casino’s favor.
Casinos provide perks such as free meals, free rooms, and free show tickets to their frequent visitors and players via a “rewards program” or some other promotional program. These programs issue a card with a magnetic stripe. When a guest shops at the resort or dines in its restaurants, the resort keeps track of purchases and awards points based on their value. These cards are also inserted into slot machines, and swiped at card tables, as another major way to earn points. Each time the card is inserted, the magstripe is decoded and presumably information about the player is transmitted over an internal secure network. A server listening on this network can know exactly when the player inserts money into the machine, how often, and how much. Casinos could use the amount of money inserted into the machine to adjust the probabilities of winning each pot, rigging them towards higher bets. Casinos could also use win history to globally adjust the conditional probability of winning certain pots on machines across the casino.
Next, starting some time in the early 2000s (maybe even the late 1990s), machines stopped accepting or awarding hard money; instead, they accept and award printed cash vouchers. Each voucher has a bar code and serial number printed on it. To the layman, this may simply represent the amount of money to be awarded when cashed in, and when it expires. I suspect these tickets are used for the same purpose as the casino card. Although each voucher has a different serial number, there may be a “cookie” encoded in the bar code with some unique transaction ID number. Even without such a cookie, tracing the transaction is simple, as the serial number on the inserted voucher can easily be matched to the serial number on the voucher printed at the end of the game, using a database on a network server. With cash vouchers, a transaction begins with the insertion of hard cash and ends when the gamer has spent all of his money on the game and leaves, when some time period has passed, or when the gamer leaves the game and never uses the voucher in another machine. Some people may be correct in their habit of “cashing out” and then depositing fresh money into the machine.
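To show how little is needed to trace a gamer by serial number alone, here is a sketch of the matching I have in mind. It is purely speculative: I have no knowledge of how casino systems actually store this data, and the record layout (`machine`, `voucher_in`, `voucher_out`) is hypothetical. It assumes every serial number is unique and that `voucher_in` is `None` when the session started with cash.

```python
def chain_sessions(sessions):
    """Link machine sessions into transactions by matching voucher serials.

    Each session is a dict with hypothetical keys:
      'machine'     - machine identifier
      'voucher_in'  - serial of the voucher inserted, or None for cash
      'voucher_out' - serial of the voucher printed, or None if none
    Assumes serial numbers are unique across all vouchers.
    """
    by_voucher_in = {
        s["voucher_in"]: s for s in sessions if s["voucher_in"] is not None
    }
    transactions = []
    for session in sessions:
        if session["voucher_in"] is not None:
            continue  # Only cash sessions start a new transaction.
        chain = [session["machine"]]
        current = session
        # Follow printed voucher -> next session that accepted it.
        while current["voucher_out"] in by_voucher_in:
            current = by_voucher_in[current["voucher_out"]]
            chain.append(current["machine"])
        transactions.append(chain)
    return transactions


# Made-up sessions: one gamer moves M1 -> M7 -> M2, another plays M3 only.
sessions = [
    {"machine": "M1", "voucher_in": None, "voucher_out": "V100"},
    {"machine": "M7", "voucher_in": "V100", "voucher_out": "V101"},
    {"machine": "M3", "voucher_in": None, "voucher_out": None},
    {"machine": "M2", "voucher_in": "V101", "voucher_out": None},
]
print(chain_sessions(sessions))  # [['M1', 'M7', 'M2'], ['M3']]
```

Note that cashing out and depositing fresh money breaks the chain in this sketch: each cash insertion starts a new, unlinked transaction.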
By using cash vouchers and casino reward cards, casino owners can track gamers throughout the casino within a small time window. This information can be visualized as a graph, or a map, and they can then optimize placement of machines to maximize bets and losses, and minimize payouts.
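As a rough illustration of the graph idea, the paths gamers take between machines can be reduced to a weighted edge list by counting how often one machine is followed by another. The machine IDs and paths below are invented for the example.

```python
from collections import Counter


def transition_counts(paths):
    """Count moves between consecutive machines across all paths.

    `paths` is a list of machine-ID sequences, one per tracked gamer.
    Returns a Counter mapping (from_machine, to_machine) -> count,
    which can serve as the weighted edges of a directed graph.
    """
    edges = Counter()
    for path in paths:
        for src, dst in zip(path, path[1:]):
            edges[(src, dst)] += 1
    return edges


# Made-up paths through a casino floor.
paths = [["M1", "M7", "M2"], ["M1", "M7"], ["M3"]]
edges = transition_counts(paths)
print(edges[("M1", "M7")])  # 2
```

Heavily weighted edges would show which machines gamers tend to visit in sequence, which is the kind of information that could inform placement.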
This suggests an experiment that may require several people in the same casino:
- On several occasions, spend a few hours playing different slot machines with and without a rewards card and compare winnings/losses.
- On several occasions, spend a few hours playing different slot machines and use fresh currency at each machine. Do not use the cash vouchers in the machine. Then, compare winnings/losses.
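One way the comparison could be done is with a simple permutation test on net results per session. This is my own suggestion for the analysis, not part of the experiment as described, and the numbers below are invented placeholders rather than real results.

```python
import random
from statistics import mean


def permutation_p_value(group_a, group_b, n_permutations=10_000, seed=0):
    """Two-sided permutation test on the difference in group means.

    Returns the fraction of random relabelings whose absolute mean
    difference is at least as large as the observed one.
    """
    rng = random.Random(seed)
    observed = abs(mean(group_a) - mean(group_b))
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    extreme = 0
    for _ in range(n_permutations):
        rng.shuffle(pooled)
        diff = abs(mean(pooled[:n_a]) - mean(pooled[n_a:]))
        if diff >= observed:
            extreme += 1
    return extreme / n_permutations


# Placeholder net results (dollars won or lost per session), NOT real data.
with_card = [-40, -25, -60, 10, -35]
without_card = [-20, 5, -30, -15, 15]
p = permutation_p_value(with_card, without_card, n_permutations=2000)
print(p)
```

A small p-value would suggest the two conditions differ by more than chance alone explains, though with only a handful of sessions the test has very little power.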
Just some food for thought.