no – i don't collect these. ;) however it got me thinking, since my son picked up the interest from his friends at kindergarten. it reminded me that when i was about his age, it was popular to collect cards with cars. the scheme was that you got 1 card with each bubble gum. the card was random, and there are quite a few different cars out there. i had just a handful of these, but a friend of mine had a LOT of them, which raised two questions in my head:

- how much bubble gum can one eat/chew in 1 summer?!
- how much did he spend on these?

ad.1 – apparently a LOT. ;) ad.2 – assessing that was beyond my skills back then… but now – we have math and we have computers! :)

regardless of what the cards represent, the idea is always the same – collect the whole set by getting random elements. so how much does it really cost? i decided to write a simple simulator of card collecting in `C++`, with some pre- and post-processing in `bash` and `gnuplot`.

every model needs a set of assumptions. i picked ones close enough to what my son is collecting:

- full deck of cards consists of 300 elements
- cards are sold in 5-card packs
- each pack costs 9.50 PLN (~2 EUR)
- each card is equally likely to appear in a pack
- to “win”, one must collect at least 1 card of each type

since the process is random by definition, i decided to run 100k simulations. each “iteration” represents 1 round – i.e. “buying” 1 pack of cards and adding it to the existing set. each simulation stops when the full deck is collected.
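the simulator itself isn't shown in this post, so here is only a minimal sketch of what a single run could look like under the assumptions above. all names (`packs_until_complete`, `simulate`, `deck_size`, `pack_size`) are made up for illustration, and the sketch additionally assumes that the 5 cards in a pack are drawn independently, so one pack may contain the same card twice.

```cpp
#include <cassert>
#include <cstddef>
#include <random>
#include <vector>

// values from the assumptions list; the names are hypothetical
constexpr std::size_t deck_size = 300;  // distinct cards in the full deck
constexpr std::size_t pack_size = 5;    // cards per pack

// buys packs until every card has been seen at least once and
// returns the number of packs ("iterations") that took.
// assumption: cards inside a pack are independent uniform draws.
template <typename Rng>
std::size_t packs_until_complete(Rng& rng)
{
  std::uniform_int_distribution<std::size_t> any_card{0, deck_size - 1};
  std::vector<bool> owned(deck_size, false);
  std::size_t missing = deck_size;
  std::size_t packs = 0;
  while (missing > 0)
  {
    ++packs;
    for (std::size_t i = 0; i < pack_size; ++i)
    {
      auto const card = any_card(rng);
      if (!owned[card])
      {
        owned[card] = true;
        --missing;
      }
    }
  }
  return packs;
}

// runs `runs` independent simulations; histogram[k] is the number of
// simulations that needed exactly k packs.
std::vector<std::size_t> simulate(std::size_t runs, unsigned seed)
{
  assert(runs > 0);
  std::mt19937 rng{seed};
  std::vector<std::size_t> histogram;
  for (std::size_t r = 0; r < runs; ++r)
  {
    auto const packs = packs_until_complete(rng);
    if (packs >= histogram.size())
      histogram.resize(packs + 1, 0);
    ++histogram[packs];
  }
  return histogram;
}
```

calling `simulate(100000, some_seed)` would mirror the 100k runs described above; the resulting histogram is the kind of data that could then be handed over to `gnuplot`.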

first let's check the distribution: how many times did we manage to collect the entire set of cards in a given number of iterations?

we can clearly see that the probability peaks at ~400 iterations, though if one is unlucky, it can also take >1k iterations to get there.
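as a rough sanity check (not part of the original simulation), this setup is the classic “coupon collector” problem: when cards are drawn one at a time, the expected number of draws needed to see all `n` cards is `n * (1 + 1/2 + … + 1/n)`. the helper below, with the made-up name `expected_draws`, just evaluates that sum.

```cpp
#include <cassert>
#include <cstddef>

// expected number of single random draws needed to see all `n` distinct
// cards at least once: n * H(n), where H(n) = 1 + 1/2 + ... + 1/n.
double expected_draws(std::size_t n)
{
  assert(n > 0);
  double harmonic = 0.0;
  for (std::size_t k = 1; k <= n; ++k)
    harmonic += 1.0 / static_cast<double>(k);
  return static_cast<double>(n) * harmonic;
}
```

for 300 cards this gives roughly 1885 single cards, i.e. about 377 packs of 5 if the cards in a pack are assumed independent. that is an average, not the peak of the histogram, but it lands in the same ballpark as the ~400 iterations above.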

so the next question is – how many iterations will give me a 99% chance of collecting the entire set? for this, we need to integrate the previous plot.

that's over 500 iterations! so how much does it cost?

well – over 5k PLN (>1k EUR)!
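“integrating” the histogram here boils down to a running sum: walk through the iteration counts and stop at the first one by which 99% of the simulations had already finished. a minimal sketch of that step follows; the function name and the histogram layout (`histogram[k]` = number of simulations that finished in exactly `k` packs) are assumptions, not the original post-processing.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// returns the smallest pack count k such that at least `fraction` of all
// simulations had completed their deck within k packs.
std::size_t packs_for_success_rate(std::vector<std::size_t> const& histogram, double fraction)
{
  assert(fraction > 0.0 && fraction <= 1.0);
  std::size_t total = 0;
  for (auto const count : histogram)
    total += count;
  assert(total > 0);

  std::size_t finished = 0;  // running sum, i.e. the "integral" of the histogram
  for (std::size_t k = 0; k < histogram.size(); ++k)
  {
    finished += histogram[k];
    if (static_cast<double>(finished) >= fraction * static_cast<double>(total))
      return k;
  }
  return histogram.size() - 1;  // only reachable through floating point rounding
}
```

multiplying the result of `packs_for_success_rate(histogram, 0.99)` by the 9.50 PLN pack price then gives the cost for a 99% chance of success.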

on the plot above you can see there's a number of “0” points. these mean there was no data point there. it makes sense, as in early iterations you simply do not have enough cards to have a set (300/5 == 60 iterations in the ideal case). around 800 iterations, on the other hand, one must be very unlucky to still not have a full set, so the frequency of points there also decreases.

so… why is it so expensive? it's simple – you get a lot of duplicates. the more iterations, the more duplicates.

thus around 500 iterations, you will have on average 9 duplicates of each card.

so collecting cards on your own is an expensive hobby. in practice however kids trade these with each other – they can swap duplicates, so that everyone gets more unique cards. let's add this to our simulation. let's assume that we act as a team, and the team wins when all the players have collected all the cards. the simulation is then very simple: a repeated card does not count as a duplicate as long as we have fewer copies of it than players.
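the team rule can be sketched by keeping one shared pile with a counter per card: a card is useful as long as the pile holds fewer copies of it than there are players, and the team is done once no useful card is left to find. this is only an illustration, the name `team_packs_until_complete` is made up, and cards inside a pack are again assumed to be independent draws.

```cpp
#include <cassert>
#include <cstddef>
#include <random>
#include <vector>

// packs the whole team buys until every player can own a full deck, i.e.
// until the shared pile holds at least `players` copies of every card.
template <typename Rng>
std::size_t team_packs_until_complete(std::size_t players, Rng& rng)
{
  assert(players > 0);
  constexpr std::size_t deck_size = 300;
  constexpr std::size_t pack_size = 5;
  std::uniform_int_distribution<std::size_t> any_card{0, deck_size - 1};
  std::vector<std::size_t> copies(deck_size, 0);
  std::size_t still_needed = deck_size * players;  // useful cards yet to be found
  std::size_t packs = 0;
  while (still_needed > 0)
  {
    ++packs;
    for (std::size_t i = 0; i < pack_size; ++i)
    {
      auto const card = any_card(rng);
      if (copies[card] < players)  // some player still lacks it, so not a duplicate
        --still_needed;
      ++copies[card];
    }
  }
  return packs;
}
```

with `players == 1` this reduces to the single-collector case, which makes it easy to compare both variants using the same random number generator.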

the average number of repeated cards and the total cost obviously grow linearly with the number of iterations, and the samples plot is already captured by the integrated samples. we'll therefore skip the redundant plots from now on and just focus on the integral of samples vs. the number of players. here are the plots.

we now need to collect more cards – at least `N*300` for `N` players. in order to get a 99% success chance, we need the following number of iterations.

note that the dependency is not linear. it's actually a bit less than linear. at a glance it looks like a (very flattened) logarithm. that's good news!

the same trend can be seen in average cards (>1 means duplicates) per player.

here we can clearly see the card count per player dropping as more players join the club… and fewer repetitions mean less expense per player!

again – we can see the same trend. for 10 players we are a little over 1500 PLN per player (compared to over 5000 PLN for a single player). there are however diminishing returns kicking in fast as the player count increases. while “the more the better” holds, in practice it's worth pursuing at least 7 players… no – wait! in practice it's worth not spending money on this! but if you insist, then try to get at the very least 7 other players. NPCs will also do.

here's a full table with the exact cost per player (for the 99% success chance), per number of players.

players | cost per player (PLN) |
---|---|
1 | 5861.50 |
2 | 3681.25 |
3 | 2878.50 |
4 | 2453.37 |
5 | 2185.00 |
6 | 1990.25 |
7 | 1851.14 |
8 | 1745.62 |
9 | 1656.16 |
10 | 1582.70 |
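the arithmetic behind such a table is simple: the total number of packs the team buys, times the pack price, split evenly between the players. the sketch below uses hypothetical names. note that the pack counts mentioned afterwards are not given in this post, they were back-computed from the table under the assumption that the bill is split evenly.

```cpp
#include <cassert>
#include <cstddef>

constexpr double pack_price_pln = 9.50;  // from the assumptions list

// cost each player carries when the team buys `packs` packs in total
// and splits the bill evenly (assumption).
double cost_per_player(std::size_t packs, std::size_t players)
{
  assert(players > 0);
  return static_cast<double>(packs) * pack_price_pln / static_cast<double>(players);
}
```

for example 617 packs for 1 player, 775 packs for 2 players and 1666 packs for 10 players reproduce the 5861.50, 3681.25 and 1582.70 PLN rows.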

just to be completely fair – while it does not make any practical sense to try to collect all the cards, the process does have benefits.

first of all – it's a social thing. :) kids learn to trade “stuff” with each other, interact, talk, weigh costs and benefits. altruism also pays back – it's not uncommon for a child to share a card with a friend “for free”… and guess what – that builds a social bond! … and occasionally gets you a free card, as well. :)

the next one is about organizing cards. there are many criteria to sort by, e.g.:

- card groups
- numbers
- alphabetically
- …

cards will also get counted.

if a child has an album for cards, it's good to organize them in a searchable fashion. my son decided that it's easy to find cards when they are sorted by numbers. PERFECT! :D however he was trying to put all the cards in by reading the number off of each card… and counting from the 1st slot up to that number, to find the position to put the card into. yes – that's `O(N^2)` the hard way (remember – there are ~300 cards!). when i saw this, i proudly taught my son merge sort – his very first algorithm! :D now he sorts the cards and inserts them into the album in order, so that usually no more than a few card slots need to be counted – thus we're at `O(N*log(N))` for sorting plus `O(N)` for inserting. nice…
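for the record, merge sort on a pile of card numbers can be sketched like this: split the pile in half, sort each half the same way, then merge the two sorted halves by always taking the smaller top card. this is a plain textbook version with made-up names (`merge_piles`, `merge_sort`), with cards represented simply as `int` numbers.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// merges two already sorted piles into one sorted pile.
std::vector<int> merge_piles(std::vector<int> const& left, std::vector<int> const& right)
{
  assert(std::is_sorted(left.begin(), left.end()));
  assert(std::is_sorted(right.begin(), right.end()));
  std::vector<int> out;
  out.reserve(left.size() + right.size());
  std::size_t l = 0;
  std::size_t r = 0;
  while (l < left.size() && r < right.size())
    out.push_back(left[l] <= right[r] ? left[l++] : right[r++]);
  while (l < left.size())
    out.push_back(left[l++]);
  while (r < right.size())
    out.push_back(right[r++]);
  return out;
}

// splits the pile in half, sorts each half, then merges the halves.
std::vector<int> merge_sort(std::vector<int> const& cards)
{
  if (cards.size() < 2)
    return cards;  // 0 or 1 card is already sorted
  auto const middle = cards.begin() + static_cast<std::ptrdiff_t>(cards.size() / 2);
  std::vector<int> const left(cards.begin(), middle);
  std::vector<int> const right(middle, cards.end());
  return merge_piles(merge_sort(left), merge_sort(right));
}
```

each level of splitting touches every card once and there are about `log(N)` levels, which is where the `O(N*log(N))` comes from.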

as a free bonus – it's now super easy to detect duplicates. :)
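the reason is that in a sorted pile equal cards always end up next to each other, so one pass comparing each card with its predecessor is enough. a small sketch, with the hypothetical name `duplicates` and cards as plain `int` numbers:

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// returns each card number that shows up more than once in a sorted pile.
std::vector<int> duplicates(std::vector<int> const& sorted_cards)
{
  assert(std::is_sorted(sorted_cards.begin(), sorted_cards.end()));
  std::vector<int> out;
  for (std::size_t i = 1; i < sorted_cards.size(); ++i)
  {
    bool const same_as_previous = sorted_cards[i] == sorted_cards[i - 1];
    bool const already_reported = !out.empty() && out.back() == sorted_cards[i];
    if (same_as_previous && !already_reported)
      out.push_back(sorted_cards[i]);
  }
  return out;
}
```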

oh – and btw! since cards are now ordered by numbers, we can use binary search (his 2nd algorithm!) for locating a given card in the album… or detecting it's not there yet and thus worth trading for.
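a hand-written binary search over the album could look more or less like this: look at the middle slot, then continue in the half that can still contain the card. the name `in_album` and the `int` card numbers are just for illustration; in real code `std::binary_search` from `<algorithm>` does the same job.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// true if `card` is already in the album; the album must be sorted by number.
bool in_album(std::vector<int> const& album, int card)
{
  assert(std::is_sorted(album.begin(), album.end()));
  std::size_t low = 0;
  std::size_t high = album.size();  // the search range is [low, high)
  while (low < high)
  {
    auto const middle = low + (high - low) / 2;
    if (album[middle] == card)
      return true;
    if (album[middle] < card)
      low = middle + 1;  // card can only be in the upper half
    else
      high = middle;     // card can only be in the lower half
  }
  return false;  // not there yet, so worth trading for
}
```

each step halves the range, so even with ~300 cards no more than 9 slots need to be checked.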

one last thing – cards are a good playground for managing pocket money (see post above).