1. You can use 5m. Given all the other assumptions you are going to make, this first approximation matters little. I recommend asking the interviewer something like "Can I use 5m to simplify the calculation?". 99% of interviewers will not only let you go with it but also appreciate the simplification.
2. In every market sizing, you should consider all the possible sources of revenue and ways of using the product. In this case, you should first split the problem in two, so that you cover all milk consumption:
- liquid milk
- milk-based products: dairy products (cheese, yogurt, butter...), ice cream, chocolate, biscuits...
Now you can start to assess the first bullet. Your solution scheme is pretty good, though the numbers may be too high. I would have added some details to turn it into a great answer. E.g. when you mention the segment of people who do not drink milk, you can give some concrete examples (lactose-intolerant people, infants, some people on a diet, vegans, ...). I think this segment is considerably larger than 20% (maybe something like one third?), but here you do not need extreme accuracy, so 20% will be considered fine as well. Use common sense.
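As a rough check on how much the 20% versus one-third choice matters, here is a minimal sketch. The 5m population is the rounded figure from point 1; the consumption per drinker is a hypothetical placeholder, not a number from the case, and the relative gap does not depend on it.

```python
# Sensitivity of the liquid-milk estimate to the share of non-drinkers.
POPULATION = 5_000_000             # rounded population from point 1
LITRES_PER_DRINKER_PER_DAY = 0.2   # hypothetical placeholder, not from the case


def liquid_milk_litres_per_year(non_drinker_share: float) -> float:
    """Yearly liquid-milk volume given the share of people who drink no milk."""
    drinkers = POPULATION * (1 - non_drinker_share)
    return drinkers * LITRES_PER_DRINKER_PER_DAY * 365


with_20_percent = liquid_milk_litres_per_year(0.20)
with_one_third = liquid_milk_litres_per_year(1 / 3)

# Relative difference between the two estimates (independent of consumption).
relative_gap = 1 - with_one_third / with_20_percent

print(f"{with_20_percent:,.0f} vs {with_one_third:,.0f} litres/year")
print(f"relative gap: {relative_gap:.0%}")
```

Moving from 20% to one third lowers the estimate by about a sixth, which is small next to the uncertainty in the other assumptions.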
Once you have completed the first sizing, you should move on to the second bullet. It is more complex, since each of the markets mentioned should be treated separately. You could ask the interviewer whether you can make a top-down estimation (for example, sizing the whole second bullet as three times the first). This is a very strong assumption and you will certainly lose a lot of accuracy, but by this point the interviewer will have already assessed your problem-solving skills and business sense, and this way you do not leave out any part of the problem. You can say something like "further effort should be spent to address the whole market, starting from its biggest missing parts, like dairy products and industrial milk-based products".
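The top-down shortcut can be sketched as follows, assuming the multiplier means that milk-based products are worth three times the liquid-milk estimate (so the total is four times the first bullet). The function name and the input value of 100 are hypothetical placeholders in whatever unit you sized the first bullet.

```python
# Top-down shortcut: size milk-based products as a multiple of liquid milk.
PRODUCTS_MULTIPLIER = 3  # strong assumption: second bullet = 3 x first bullet


def total_milk_market(liquid_milk: float,
                      multiplier: float = PRODUCTS_MULTIPLIER) -> dict:
    """Combine the liquid-milk estimate with the top-down products estimate."""
    products = liquid_milk * multiplier
    return {
        "liquid_milk": liquid_milk,
        "milk_based_products": products,
        "total": liquid_milk + products,
    }


# 100 is a placeholder for your own liquid-milk estimate (litres or currency).
estimate = total_milk_market(100.0)
print(estimate)
```

Keeping the multiplier as an explicit parameter makes it easy to tell the interviewer which single number you would refine first if you had more time.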