How Big A Collection Does One Need?

Coins, while a medium of exchange, can also be an object of beauty.

My Uncle Tom Cooper was an avid coin collector. He imparted an interest in coin collecting to my brother and me when we were young boys. Uncle Tom focused on the usual things: mint sets, dollar serial sheets, and historical coins, eventually amassing quite a nice collection. My brother renewed my interest in coin collecting with the “America the Beautiful” quarter series. Combined with the spare change from my various international trips, those quarters technically make me a “coin collector”. (You can’t spend coins from other countries in the U.S., such as my few pre-Euro French Francs, etc., so there is nothing to do but keep them in a bag!) But my effort pales when compared to my uncle’s collection.

There seems to be no answer regarding how many coin collectors exist. Someone who collects bicentennial quarters or wheat pennies enjoys coin collecting. One could argue that anyone with change in their pockets, a jar by the laundry, etc., becomes a coin collector. A collector’s goals could range from storing coins found in spare change to acquiring historical, rare, or valuable coins. In all cases, the collector receives some value beyond the collection’s market value, such as learning about the desired objects or enjoying their intrinsic beauty. (To the data scientist, would this be considered metadata?)

So the question remains: what makes a collection, including how to value it or determine its minimum size? In all cases, it is not the number of objects that makes something a collection, but rather the owner’s interest that genuinely creates value. That is why I was fascinated by the following story (Escape Pod 746: The ’76 Goldwater Dime). The protagonist’s collection is smaller than mine, and he struggles to get anyone to believe in his collection’s validity. He perceives that his collection has value but questions whether others would share his interest or understand the collection itself. But ultimately, he is happy with his collection, regardless of what others believe.

This led me to consider the question: how big does a collection need to be? The world is full of collectables, such as coins, but also books, dolls, bicycles, etc. There is no end to what one may collect; however, time, money, and our own interest limit what we will collect. Furthermore, we are limited in what we wish to collect by considerations of acquisition, storage, etc., unless of course we become the stereotypical hoarders.

As an economist, I believe people are happiness-seeking creatures, so any collection (that is legal, etc.) becomes a good thing, regardless of the collection’s size or market value. For if beauty exists in the eye of the beholder, the creation of beauty benefits us all.

December 29, 2021

Investing in the 4 I’s: Infrastructure, Information, Institutional, and Intangibles

In 2006, I made this presentation at UNCTAD on Information and communications technologies (ICTs). I thought I would post it now, as many of the same issues from 15 years ago remain relevant today.

Supporting the growth of international trade requires providing safe and secure borders that do not overly burden shipments moving through those same facilities. The world is becoming more integrated through telecommunications and transportation, a topic expressed through many UNCTAD documents over the past few years. This interdependency between regions indicates the growing need to improve the facilities that handle trade and information exchanges. I will try to discuss ICT in the context of four broad areas: Infrastructure, Information, Institutional, and Intangibles.

Infrastructure

Generally, this is the first area people consider when handling international trade. It is the most important – without facilities to receive and work cargos, trade will be impossible. We can point to many facilities throughout the world that are underutilized for several reasons. One of these failures involves understanding location – trade facilitation usually involves a three-mile section in a three-thousand-mile movement. Carriers and shippers benefit when the goods move and there are no “roadblocks”. Significant delays can occur at transactional points, borders, and ports. There is a disconnect between using a facility and developing a facility for future use. There are problems with planning for infrastructure – generally, it involves either a “lumpy” large-scale investment or incremental improvements to an existing structure. This “either/or” investment approach underlines the fact that infrastructure improvements require the spending of real funds – funds generally secured by either the public sector planning process or private sector investment. The need exists to service growing traffic, but physical constraints may limit expanding existing terminals quickly to handle unexpected cargo growth. This inability to move funding to improve physical capacity raises concerns that existing border crossings or ports/terminals may become practically obsolete before becoming physically outdated, as larger ships and operational changes leave facilities and networks unable to service surging demand.

Regarding ICT, this can result in additional challenges – such as the size and location of offices, the ability to maintain the facility once developed, and the roads and other networks that support the facility. Improving system performance may become a problem in the United States. According to a survey done by the Federal Highway Administration seven years ago, the worst roads on the national highway system were the small connector roads linking ports. Intelligent Transportation Systems and driver notifications can improve operational patterns in a facility or on local roads. Also, the Federal Highway Administration conducted several studies to develop simulation models of border crossings and performance measurements for operational improvement. Other studies sought to examine the travel time associated with specific border facilities. Finally, I worked with Transport Canada and the American Transportation Research Institute on using satellite technology to develop travel time measures for trucks. By understanding the nature of traffic around a port or border crossing, one can create a systems approach that would extend driver notification beyond the movement to the first checkpoint to cover the entire system.

Information

This area has seen some exciting changes in the past few years. Telecommunications and computer technologies have resulted in people expecting data and information to be available on a real-time basis. The private sector shippers and carriers have developed this information not to share with local ports and border crossings but to capture benefits from controlling costs and inventories. Furthermore, the development of third-party logistics firms and the resulting integration of shippers and carriers in shipment decisions have resulted in additional operational gains while adding more complexity to the system.

At the same time, automated customs manifests and shipper notifications have been developing. In the U.S., the movement to a single Automated Manifest system has been a tedious process, given the coordination that involves over 70 agencies on data regarding international shipments.

Operational information sharing between the public and private sectors remains a growing need. In Southern California, the Pier Pass system has successfully reduced truck delays at container ports. Along the U.S.–Canadian border, expedited shipments are now available to shippers and carriers approved by U.S. Customs. Several electronic tracking options should be considered to improve the productivity of the navigation system. For inland ports, the Smartlock system, being evaluated by the Port of Pittsburgh, utilizes the Global Positioning System (GPS) to locate representative points on the towboats and barges. By using GPS technology, a pilot could move a barge through a lock or channel, even during times of limited visibility.

Institutions

In the developed world, we are moving from the age of expansive infrastructure construction to the age of maintenance and institutional partnerships. In some regards, developing countries have jumped over the developed world in seeing ways to build public-private partnerships to improve trade facilitation. But in both cases, institutional barriers to implementing changes at ports and border crossings exist. There is no single entity responsible for freight movements at ports and borders. At what level should the private sector discussion occur: at the port, the drayage operator, the shipper, or the carrier? What is the public sector role: is it defined by the port, customs authorities, local departments of transportation, or other federal or state agencies? Each group needs information on activities at the port or border crossing, but at different time frames and scales. For example, the private sector is examining events within the context of a few days – such as securing the necessary documents or the trucks to move the cargo to or from a border crossing. The public sector responds either to goods already in transit or to very long-range planning activities. Although standard features exist, the specific information needed for these different levels is not the same.

I hate stating the obvious – ports and borders are geographic entities. They cannot respond quickly to changing national or local policies and cannot simply move to take advantage of opportunities. For example, in the early 1990s, California decided to remove the tax exemption for bunker fuels. This tax led to a dramatic loss of bunker business and changed costs associated with charter movements along the west coast. Although the tax was later repealed, the industry adjusted to alternative sourcing options, resulting in a loss of revenue to the bunkering industry.

Furthermore, the potential challenge of locally active participation by other groups concerning current and future use of specific facilities may change a facility’s competitive position. While trade facilitation seeks to improve cargo movement through a facility, in some cases, local groups have sought to reduce traffic because of concerns over externalities such as traffic congestion, noise, and air emissions. This potential disruption from other local or national policies should be considered, especially if addressing security concerns may offset improving ICT.

Intangibles

Port and border crossing planners and operators must not assume that building a facility will guarantee its success. Other factors outside of the ICT framework can shape the ultimate success of a gateway facility. In the past, transportation was associated with production decisions, while networks developed around production and consumption regions. In the new global business paradigm, low-cost technology and flexible production and logistics support have changed investment in plant and equipment from a more long-term framework to simply being five- to seven-year assets. With the potential for rapid turnovers in production locations and operations, the importance of linking transportation and economic growth remains even more critical.

In this new global network, transportation is geographically blind. Ports and borders are now interchangeable links in the system, not a separate component of transportation activity. An example involves the U.S. West Coast, where failed labor negotiations led to a coastal shutdown in 2002. Since then, shippers have become increasingly concerned about controlling transportation costs and system reliability. If a shipper feels that one port range is too crowded or that problems exist, that shipper may switch to either another carrier or port for some or all shipments.

While not necessarily an ICT function, the Corps of Engineers is developing a suite of products to examine the interchange between national and international policy, national planning, and specific operational and planning models. The Navigation Economic Technology System (NETS) seeks to explain how the system relates to itself. One of these tools is the development of the Regional Routing Model, which aims to develop an economic model of multiport relationships either within a national context or within specific trade corridors. Other tools include a global grain model of trade flows that can be used for policy analysis and simulation models for harbors and inland navigation. The NETS program will release these tools into the public domain once they are completed and hopefully show the interaction with freight facilities and the need for nations to consider these facilities in the context of other policies.

Tech Transference to Developing Countries

In viewing developing countries’ infrastructure needs from some distance, the potential exists for technology transference to the developing countries under certain circumstances. These considerations include recognizing that the developing world does not have the same infrastructure and institutions as developed countries, which means the level of funding or coordination is much different. We should not necessarily hold developing countries accountable to the same standard of operational activity within a short time once a technology transference occurs. Finally, the developed world needs to approach the developing countries as peers and commit to long-term training and personnel development.

Conclusion

Improving transportation means so much more than it did fifty, twenty, or even ten years ago, incorporating concerns over flexibility, improving operations, and positioning for handling uncertain traffic forecasts. Ports and borders must be more accountable to the carriers, shippers, and other groups. New approaches that balance infrastructure, information, and institutional changes may be necessary to ensure that trade remains a critical component to sustain economic growth. Generally, these discussions on needs tend to focus on the “infrastructure” question. How do we build the system, and what do we need to spend to make it work? These questions are essential to answer, but the most significant gap may remain the intangible: changing how people comprehend the value of borders and ports and how to improve capacity both now and in the future.

When I gave this speech, I was so nervous I rattled it off too fast, leaving the interpreters behind. But while a few things, such as the NETS program, have stopped, the work on integration, modeling and technology has continued. It will be interesting to revisit this speech in another 15 years to see what changed!

November 3, 2020 | February 1, 2022

2020 The Asterisk Year

We tend to think in nice round numbers, such as fives, tens, hundreds. Despite being a nice round number, 2020 will always be the year with the asterisk.

Researchers will seek to account for the social, economic, and political events of the year by assuming 2020 can be “normalized”. This is too simple a concept. If the economy can be represented as a factory that can be stopped and started, then concerns over 2020’s prospects are unfounded. However, this ignores the many activities that require multiple years to complete, such as capital programs, public services, or other planning and permitting activities. The challenge will be to see how activities with longer horizons perform during 2020. It may take many years to get to the new “normal”.

January 11, 2020

Data as a Model – Football Yardage

Data is an abstraction of a physical activity. When describing data, we are measuring one element whose outcome may actually be influenced by multiple variables.

On Monday, the LSU Tigers will play the Clemson Tigers for the College Football National Championship. Before, during and after the game, reporters, fans, and announcers will compare many metrics. They will discuss turnovers, first downs, penalties, etc., but the most common statistic (beyond the score) will be offensive yards. Offensive yardage represents many things: the quality of the offensive line (or its lack of execution), each coach’s play-calling, and the quality of the quarterback’s and receivers’ play. For example, Joe Burrow’s highlight against Georgia is recorded as a 71-yard pass and that is all. The duration of the play, etc., are compressed into one small data point.

1st & 10 at LSU 20

https://www.espn.com/college-football/playbyplay?gameId=401132981

(3:57 – 3rd) Joe Burrow pass complete to Justin Jefferson for 71 yds to the Geo 9 for a 1ST down

So, when you are watching the game, remember the announcers often describe an action by a single variable, one which is influenced by many things. And for some items, “data” fails to describe the variables that create these memorable moments.
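The compression is easy to see if the play-by-play line above is parsed into a record. The following is only a sketch: the `PassPlay` class, the `parse_pass_play` function, and the regular expression are names and assumptions invented for this illustration, not anything defined by the play-by-play source.

```python
import re
from dataclasses import dataclass


@dataclass
class PassPlay:
    """Everything the play-by-play line records about a completed pass."""
    clock: str
    quarter: str
    passer: str
    receiver: str
    yards: int


# Hypothetical pattern written for the single line quoted in this post;
# it accepts either an en dash or a hyphen between the clock and the quarter.
PLAY_PATTERN = re.compile(
    r"\((?P<clock>\d+:\d+)\s*[–-]\s*(?P<quarter>\w+)\)\s+"
    r"(?P<passer>.+?) pass complete to (?P<receiver>.+?) "
    r"for (?P<yards>\d+) yds"
)


def parse_pass_play(line):
    """Return a PassPlay for a completed-pass line, or None if it does not match."""
    match = PLAY_PATTERN.search(line)
    if match is None:
        return None
    return PassPlay(
        clock=match.group("clock"),
        quarter=match.group("quarter"),
        passer=match.group("passer"),
        receiver=match.group("receiver"),
        yards=int(match.group("yards")),
    )


line = ("(3:57 – 3rd) Joe Burrow pass complete to Justin Jefferson "
        "for 71 yds to the Geo 9 for a 1ST down")
play = parse_pass_play(line)
print(play)
```

The record has room for a clock, a quarter, two names, and a yardage figure. The blocking, the route, and the coverage that produced the play have no field at all.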

GEAUX TIGERS!!

October 30, 2019

Eating Chocolates and Performance Metrics

We have all seen or heard this quote from Peter Drucker.

https://i0.wp.com/www.stonevp.com/wp-content/uploads/2016/01/measure-control.png?resize=303%2C227&ssl=1

The focus on performance is a byproduct of a data-rich world. Deploying “the internet of everything” provides the ability to improve system performance at a greater degree of granularity, if we all can agree upon the desired outcome.

A fan of slapstick/physical comedy, I always enjoyed this skit. Lucille and Vivian are unable to keep up with their chocolate wrapping assignment. They eventually “hide the evidence” that the system is failing, as their confidence turns to panic. (The manager actually created a perverse incentive, i.e., no unwrapped chocolates. To avoid being fired, they actually do a worse job than if they had been truthful about their work, or if the manager had observed whether they were performing as expected.)

The manager saw the chocolates were gone. She was delighted, but did not understand the system’s real performance. One could argue that her measurement tools were weak, but her eyesight was sufficient to allow her to believe that no other testing was necessary: the objective was met, with no unwrapped chocolates in the other room. Lucille and Vivian do not confront the manager. Their mouths are full of chocolates, thus agreeing to be overworked yet again.

So, when examining ways to manage performance measurements, industrial processing does a good job of discussing flow charts, etc., but it may not necessarily capture the ingenuity of the work bench! And this is where the second Drucker quote serves as a useful counterpoint.

[Image: Peter Drucker quote, from “Lessons In Mentorship From Peter Drucker”, Credera]

But there may be a better quote… “just remember performance measures are like a box of chocolates.”

[Image: Forrest Gump quote, via QuotesGram]

August 9, 2019

Book Review – Everybody Lies

Summary: “Everybody Lies” http://sethsd.com/everybodylies was an enjoyable, fascinating book describing how understanding metadata about internet searches can provide information concerning people’s “true feelings, emotions, or opinions”. The book assumes people are more honest when they are anonymously seeking information. Reviewing those searches in aggregate provides information that social scientists may be unable to collect in other formats.

The Main Arguments

Researchers struggle to understand people’s behaviors, needs and their true opinions. In Part I, Data, Big and Small, the author outlines the need to frame social science research based on understanding big and small data. Using his grandmother’s dating advice was a great example of using Big Data (page 25). But there are cautions here, for we can pick and choose what observations we use in making those conclusions.

People will “lie” to researchers for many reasons, such as not expressing their true feelings to avoid judgement by the researcher. In this case, the use of internet searches, often done in private, can provide a way to better estimate broad trends concerning how people understand the world. The main section of the book, Part II, the Powers of Big Data, illustrates the disconnect researchers face when researching topics such as Sex, Hate and Prejudice, Internet, Child Abuse and Abortion, Facebook and Customers. Each topic gets an introduction concerning what people have studied, and how using internet search information can confirm, deny, or provide new insights into the topic.

Throughout the book, there were cautionary tales that having more data may not generate more useful information or that not every belief can be quantified through the data. He criticizes as useless those studies that would find “most Knicks fans live in the New York area”. In Part III, “Big Data: Handle with Care”, the author begins the real discussion: big data can be a boon to good governance and addressing social needs. But the real caveat is that such needs may not be in everyone’s self-interest. There are concerns that having more data could introduce more errors, such as the curse of dimensionality, where the odds of finding a correlation between two elements increase simply because there is more data in which to find a possible correlation.
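The dimensionality caution can be illustrated with a small simulation. This is a sketch of my own, not something taken from the book: the function names, the sample size of 30 observations, and the use of purely random numbers are all assumptions made for the illustration. It compares one random “target” series against a growing number of unrelated random series and reports the strongest correlation found by chance.

```python
import math
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def strongest_chance_correlation(n_variables, n_observations=30, seed=0):
    """Largest |r| between a random target and n_variables unrelated series.

    Every series is independent noise, so any correlation found is spurious.
    """
    rng = random.Random(seed)
    target = [rng.gauss(0, 1) for _ in range(n_observations)]
    strongest = 0.0
    for _ in range(n_variables):
        candidate = [rng.gauss(0, 1) for _ in range(n_observations)]
        strongest = max(strongest, abs(pearson(target, candidate)))
    return strongest


for k in (10, 100, 1000):
    print(k, round(strongest_chance_correlation(k), 3))
```

Because the same seed produces the same sequence of series, each larger run contains the smaller run’s candidates plus more, so the strongest chance correlation can only stay the same or grow as variables are added. None of the series has any real relationship to the target.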

Methodology, Evidence, and Context

The book was not analytically oriented, but the charts and tables were helpful in illustrating how we “lie to ourselves” when we consider our public disclosures (Facebook posts) compared to our private searches. I went to Google Trends to test a few searches, and it is a useful proxy concerning people’s interest in a topic by time and geography. The book presented, and footnoted, many studies, showing the author’s thoroughness, and would be a useful first document for additional research on some of these topic areas.

Style

The book’s content and layout were very accessible, and the stories engaging. While I would have enjoyed seeing even more tables, charts, etc., such additions would have reduced the effectiveness of the work (and I could look them up with the references!). There are some graphics in the TED Talk, which I found very helpful.

Final Takeaways

I enjoyed the comparison between himself and his brother regarding baseball. I am not a baseball fan, but my father loved football. Cultural references shape our experiences in ways we do not understand when we are children, but these items influence our tastes and desires in adulthood.

I thought the best part of the whole piece was Chapter 8, Mo Data, Mo Problems? What We Shouldn’t Do (especially after the A/B testing sections; scary that we are so easy to manipulate!). With more data comes the assumption that “we” can do more. But does more data mean we have more actionable items, or do we simply have more confusion when making choices? The author mentions the movie Minority Report. When discussed in this context, the original story written by Philip K. Dick is even more horrific, as other PreCogs pick up the story at different points. Based on concerns with big data, there exist more ethical challenges that remain to be addressed concerning ownership of our physical and online identities.

Finally, I liked the honesty of the “conclusion challenge”, especially after mentioning how Freakonomics influenced his professional interest in data research. Seth, if we ever meet, I will buy the first round in celebration of your success in writing such an accessible, fun, and most importantly, insightful book.

August 4, 2019 | May 27, 2021

Don’t Tell Me!!!

Wise men don’t need advice. Fools won’t take it.
–Benjamin Franklin

The older I get, the more I see this message as true. It is easy to assume we are all experts. For a researcher, this is not a good attitude. We all know the one way to do any research activity (process, data, approach, etc.), but in doing so, we often forget the joy that comes from learning something new. It is in that learning, based on recommendations, comments, critiques, etc., that we grow as researchers. But it is in teaching others that we learn more.

Photo by Priscilla Du Preez on Unsplash

August 4, 2019

Why Learning to “Question The Question” Matters


As a young researcher working at the Port of Long Beach, I answered requests generated from the port staff. (As my time in Long Beach occurred before the Internet became the “knowledge search tool”, I had to understand what people needed and why they needed the information!) After plenty of “This is not what I need”, “I wanted it like the report you did a year ago”, and “how much do you spend on data purchases”, I realized that it was not only understanding their “question”, but knowing what intelligence they needed. So, I asked questions about their request (sometimes the light bulb takes a while to come on…). Surprisingly, once I took the time to question the requester, the research became better (more timely, focused, etc.). (There was a great discussion on the importance of questions by Hal Gregersen on “The One You Feed” Podcast.)

Disclaimer: The following assumes these are internally generated questions. While the same approach could be used for evaluating service consulting requests, there exist other program elements one would add beyond these questions.

The questions fall into four broad categories: Institutional, Skills, Costs, and Review. The Institutional category links the inquiry to the organization’s goals and values. One could argue these are the most important to know, for they outline what is expected, but I would argue they are not the only thing to assess. The Skills category is a self-determination about your ability to provide the answer, while Costs outline what (if any) additional resources may be needed. Finally, the last category is Review, i.e., what can I do better/different in my current work activities based on this request. (Rearranging the 4 categories results in RISC, an appropriate reminder of the possible consequences of bad/misinformed research.)

Institutional: The objective is to provide timely intelligence to support the organization’s mission. In many ways, knowing the right answer but for the wrong question does not help anyone, and researchers must guard against our own biases concerning what we think someone needs. I had to learn to ask the following questions:

Who needs this,
Who asked the question,
When do they expect an answer,
What are their expected outcomes (and by when),
Can you repeat their inquiry back to them in a clear, concise manner,
Will this require an internal review, and if so, who would do that work,
Will this intelligence be used internally or externally,
Who will review this work,
How important is this request when compared to other requests,
Into what format do you want the report (chart, text, etc.),
Is this question related to some legal request, requiring documentation, or following a specific guidance goal,
Will this require a presentation/training on my part when completed,
What level of confidence are they willing to accept, which can range from a rough guess to a high degree of confidence?

Skills: In many ways, this is the hardest category to consider, for one must be honest. Without this assessment, the researcher may needlessly expose themselves to having their work deemed less than acceptable over time. Some questions may include:

Do I have the time,
Do I have access to the data to complete the task,
Do I have the software/skills to complete the task,
Do I want to do this research,
What happens if I don’t do this,
Is this like previous questions I (or others) have answered in the past,
Can you repeat their inquiry back to them in a clear, concise manner,
Can someone else answer this question better than me,
Do I have the domain knowledge to understand the topic,
Do I need a collaborator,
Do I need some training to answer this question?

Costs: Sometimes there are costs associated with doing business research. Not all data is accessible in the format one needs, nor, contrary to what people believe, is all information “free” on the internet. The researcher must understand the resource costs, but these may matter little to the person who generated the inquiry!

Do I need to purchase data/information services,
Do I need to get a license or right to access the data,
Do I need to purchase software or hardware,
Do I need to hire a consultant because I do not have the skills, time, or energy to complete this project in the anticipated format,
Can I legally share this data, or does it have to be summarized, etc.,
Do I need to pay for training to respond to this request?

Review: After the work is delivered, sometimes it is helpful to review with the inquirer to understand how your research met their needs. And for any professional researcher, this is an ongoing query regarding “do I have the right knowledge to do my appointed tasks”. These questions may include discussions such as:

Will I be asked similar questions in the future,
Do you want yourself/others to access this information directly without asking me,
Do you need training to access the data yourself,
Do you or I need more domain knowledge,
Did the information satisfy our organization’s needs?
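For anyone who wants to track these questions alongside a request, the four categories can be kept as a simple checklist. The sketch below is only one possible arrangement: the `CHECKLIST` structure and the `open_questions` helper are hypothetical names for this illustration, and only a handful of the questions above are copied in to keep it short.

```python
# A few questions per category, copied from the lists above; the full lists
# are longer. The structure and names here are illustrative assumptions.
CHECKLIST = {
    "Institutional": ["Who needs this", "When do they expect an answer"],
    "Skills": ["Do I have the time", "Do I need a collaborator"],
    "Costs": ["Do I need to purchase data/information services"],
    "Review": ["Will I be asked similar questions in the future"],
}


def open_questions(answers):
    """Return (category, question) pairs that have no recorded answer yet."""
    return [
        (category, question)
        for category, questions in CHECKLIST.items()
        for question in questions
        if not answers.get(question)
    ]


# Example: only one question has been answered so far.
answers = {"Who needs this": "Port staff"}
for category, question in open_questions(answers):
    print(f"{category}: {question}?")
```

Keeping the unanswered questions visible makes it harder to start the research before the request itself is understood.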

So, what did I do once I better understood internal needs?

After a while, I started to see that most questions centered around “who was doing what where” and “were they successful”. Knowing most questions focused on certain topics, it was easy to incorporate those queries into my ongoing data/market research activities. Ultimately, this led to the development of the Port’s first maritime data mart by integrating PIERS into Oracle with many long-forgotten programs (such as Paradox and Brio). The data mart, using various scripts, generated quarterly market reports for Senior Staff. The information also supported specialized research studies for current or potential clients of the port concerning market patterns.

But people do not “understand the value of information”, something every researcher laments. When I was at the Port of Long Beach, Don Wylie, my boss, instructed me to include on every report “the data was developed by the Trade and Maritime Services using PIERS data”. The following year, there was no debate concerning renewing the PIERS data purchase, nor the value that the Trade Office provided.

In sum, asking the right questions, through a structured approach, can illuminate everyone’s expectations. This should result in more successful projects, while demonstrating the value of a robust internal research mission.

June 20, 2019 | September 14, 2021

The Master of Your Domain

We talk about others being a legend in their own mind, although we like to think we are “Masters of Our Domain”. When it comes to data and analysis, that domain may not be a physical space, but the information and intelligence one manages/controls. For example, my background has focused on ports, transportation, and freight movements, resulting in my domain knowledge regarding international trade.

But there is more than simply being the Master of One’s Domain to be a solid researcher. One has to know how domain knowledge can shape a research question.

Let’s look at this exchange from “Monty Python and the Holy Grail”, where the troll asks three questions. One of the questions is fairly complicated. The King asks for clarification, based on the domain knowledge gained earlier in the film from two soldiers who possess specialized knowledge of swallows.

The question concerning the average airspeed velocity of an unladen swallow may only interest researchers examining the physics of avian flight (or Monty Python fans). But having learned something about swallows earlier, the King knew enough about the domain to ask for clarification (in this case, to delay) by asking about another data attribute.

Regarding the query, the question of the average airspeed concerns a specific data element, but the second question was based on another attribute, namely the type of swallow. For most researchers, knowing that extra bit of information may make the difference between good research and great research, or in this case, who lives or dies. So, there remains a benefit to being the domain master, as King Arthur reminded Bedevere as they crossed the bridge, but only if one learns not only new data but how to apply that information.
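The difference between the data element and the attribute can be shown with a toy calculation. The sketch below uses hypothetical observations: the airspeed numbers are made-up placeholders chosen for easy arithmetic, not real measurements, and the function names are invented for this illustration.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical observations: (swallow type, airspeed). These values are
# placeholders for illustration only, not real measurements.
OBSERVATIONS = [
    ("european", 9.0),
    ("european", 11.0),
    ("african", 19.0),
    ("african", 21.0),
]


def average_airspeed(observations):
    """Average over every record, ignoring the type attribute."""
    return mean([speed for _, speed in observations])


def average_airspeed_by_type(observations):
    """Average per swallow type, i.e. the answer to the King's follow-up."""
    speeds = defaultdict(list)
    for swallow_type, speed in observations:
        speeds[swallow_type].append(speed)
    return {swallow_type: mean(values) for swallow_type, values in speeds.items()}


print(average_airspeed(OBSERVATIONS))
print(average_airspeed_by_type(OBSERVATIONS))
```

With these placeholder numbers, the single overall average describes neither group; asking about the second attribute is what makes the answer usable.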

May 13, 2019

What If the Horseshoe Falls Off?

There is the old nursery rhyme about how a kingdom is lost because a horseshoe falls off. The poem refers to paying attention to little things that can make a difference, as the causal relationship of minor things failing can evolve into major problems (the Space Shuttle Columbia is but one of many examples). While one could argue its importance for military logistics or other more mundane tasks (such as learning the basics when mastering any skill), the same logic could be applied to not only the development of data but to data applications.

In the age of “Big Data”, we see where more information can provide insights that were unavailable just five years ago. The use of Artificial Intelligence and Machine Learning will transform how we collect, manage and process data, providing insights that will assist researchers and decision makers. However, the causal relationships between collecting/using data and any unintended consequences remain.

For example, one could argue that I represent three people. There is a physical me, who eats, sleeps and walks around, and a legal me, who signs legal documents and has financial interests. There is also an emerging digital me, who lives and works in a virtual world. My information is collected, processed, and analyzed, as I become “a product” sold to others. In many ways, the data collected from millions of digital actions are creating better horseshoe nails for business, governments and others, but will this lead us to lose the kingdom of our individualism?