TESTING METHODOLOGY


The methodology of the umlaut connect Mobile Benchmark is the result of more than 15 years of testing mobile networks. Today, network tests are conducted in more than 80 countries. Our methodology was carefully designed to evaluate and objectively compare the performance and service quality of mobile networks from the users’ perspective.


The umlaut connect Mobile Benchmark in the Netherlands comprises the results of extensive voice and data drivetests and walktests as well as a sophisticated crowdsourcing approach.

DRIVETESTS AND WALKTESTS

The drivetests and walktests in the Netherlands took place from November 6th to November 14th, 2020. All samples were collected during the day, between 8.00 a.m. and 10.00 p.m. The network tests covered inner-city areas, outer metropolitan and suburban areas. Measurements were also taken in smaller towns and cities along connecting highways. For the drivetests, umlaut used four vehicles. The connecting routes between the cities covered about 3,000 kilometres. In the cities, the test cars drove about 2,180 km and in towns approx. 580 km.

In addition to the drivetests, two walktest teams took measurements on foot, visiting so-called "areas of interest" with a high visitor frequency such as train stations, airport terminals, coffee shops, museums and also local public transport. Rides on long-distance trains were also part of the walktest schedule. The combination of test areas was selected to provide representative test results across the Dutch population. The areas selected for the 2020 test account for 6.2 million people, or roughly 36.3 per cent of the total population of the Netherlands. The test routes and all visited cities and towns are shown in the results section of this report.

The drivetest cars and walktest teams were equipped with arrays of Samsung Galaxy S10 and S20+ smartphones for the simultaneous measurement of voice and data services.

VOICE TESTING

One Galaxy S10 per operator in each car was used for the voice tests, setting up test calls from one car to another. The walktest team also carried one Galaxy S10 per operator for the voice tests. In this case, the smartphones called another Galaxy S10 as a stationary counterpart. The audio quality of the transmitted speech samples was evaluated using the HD-voice capable, ITU-standardised POLQA wideband algorithm. All smartphones used for the voice tests were set to VoLTE preferred mode.

In the assessment of call setup times we also rated the so-called P90 value. Such values specify the threshold in a statistical distribution below which 90 per cent of the gathered values range. For speech quality, we published the P10 value (10 per cent of the values are lower than the specified threshold), because in this case higher values are better.
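The P90 and P10 evaluations described above can be sketched as a simple percentile computation. This is an illustrative sketch only: the sample values are invented, and umlaut's exact interpolation method is not specified in the text, so a nearest-rank approach is assumed here.

```python
def percentile(values, p):
    """Return the threshold below which roughly p per cent of the
    values lie (nearest-rank method; assumption, not umlaut's spec)."""
    ordered = sorted(values)
    # Index of the smallest value that covers p per cent of the samples.
    k = max(0, int(round(p / 100 * len(ordered))) - 1)
    return ordered[k]

# Invented sample measurements for illustration:
call_setup_ms = [820, 910, 760, 1020, 880, 950, 790, 870, 930, 1100]
speech_mos    = [3.9, 4.1, 3.7, 4.3, 4.0, 3.8, 4.2, 4.1, 3.6, 4.0]

p90_setup = percentile(call_setup_ms, 90)  # 90% of setups are at or below this
p10_mos   = percentile(speech_mos, 10)     # 90% of samples score at or above this
```

Lower call setup times are better, hence the P90 threshold; higher speech quality scores are better, hence the P10 threshold.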

In order to account for typical smartphone-usage scenarios during the voice tests, background data traffic was generated in a controlled way through injection of 100 KB of data traffic (HTTP downloads). We also evaluated the so-called Multirab (Multi Radio Access Bearer) Connectivity. This value indicates whether data connectivity is available during phone calls. The Voice scores account for 32 per cent of the total results.

DATA TESTING

Data performance was measured using four more smartphones in each car, one per operator. In two of the cars, this was another Galaxy S10, set to 4G preferred mode. The other two cars as well as the walktest team carried one Galaxy S20+ per operator, set to 5G preferred mode, enabling 5G connectivity wherever available.

For the web tests, the smartphones accessed web pages according to the widely recognised Alexa ranking.

In addition, the static "Kepler" test web page as specified by ETSI (European Telecommunications Standards Institute) was used. In order to test the data service performance, files of 5 MB and 2.5 MB for download and upload were transferred from or to a test server located in the cloud. Furthermore, the peak data performance was tested in uplink and downlink directions by assessing the amount of data transferred within a seven-second period.
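Both test types reduce to the same arithmetic: bits transferred divided by elapsed time. The file sizes and the seven-second window come from the text above; the transfer times and byte counts below are invented for demonstration.

```python
def avg_rate_mbps(size_mb, seconds):
    """Average data rate in Mbit/s for a fixed-size transfer."""
    return size_mb * 8 / seconds

# Fixed-size transfers (durations are invented example values):
dl = avg_rate_mbps(5.0, 1.6)    # 5 MB download completed in 1.6 s
ul = avg_rate_mbps(2.5, 1.25)   # 2.5 MB upload completed in 1.25 s

# Peak test: data moved within the fixed 7-second window
# (byte count is an invented example value).
bytes_in_window = 210_000_000
peak = bytes_in_window * 8 / 7 / 1e6   # Mbit/s over the window
```

The fixed-size transfers characterise typical throughput, while the fixed-duration peak test rewards whatever maximum rate the network can sustain for seven seconds.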

The evaluation of YouTube playback takes into account that YouTube dynamically adapts the video resolution to the available bandwidth. So, in addition to success ratios and start times, the measurements also determined the average video resolution.

All the tests were conducted with the best-performing mobile plan available from each operator. Data scores account for 48 per cent of the total results.

CROWDSOURCING

Additionally, umlaut conducted crowd-based analyses of the Dutch networks, which contribute 20 per cent to the end result. They are based on data gathered between mid-May and the end of October 2020.

For the collection of crowd data, umlaut has integrated background diagnosis processes into more than 800 diverse Android apps. If one of these applications is installed on the end-user's phone and the user authorizes the background analysis, data collection takes place 24/7, 365 days a year. Reports are generated for every hour and sent daily to umlaut's cloud servers. Such reports occupy just a small number of bytes per message and do not include any personal user data. This unique crowdsourcing technology allows umlaut to collect data about real-world experience wherever and whenever customers use their smartphones.

QUALITY OF BROADBAND SERVICE

For the assessment of network coverage, umlaut applies a grid of 2 x 2 km tiles (so-called evaluation areas or EAs) over the test area. For each tile, a minimum number of users and measurement values must be available. In order to assess the Coverage Excellence, umlaut awards one point if the considered network provides 4G or 5G coverage in an EA. Another point is awarded to a candidate for each competitor who provides a smaller or no share of broadband usage. In a country with four contenders, a candidate can thus reach up to four points per tile: one for broadband coverage and three additional ones for "beaten" competitors. The assessment then relates the obtained points to the total possible points for Coverage Excellence.
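The per-tile scoring described above can be sketched as follows. This is a hedged reconstruction from the text, not umlaut's implementation; the operator names and usage shares are invented, and ties are assumed not to earn a "beaten competitor" point.

```python
def tile_points(operator, tile):
    """Points for one operator in one 2 x 2 km evaluation area (EA).

    `tile` maps operator name -> share of broadband (4G/5G) usage in
    this EA. One point for providing broadband coverage at all, plus
    one point per competitor with a smaller (or no) broadband share.
    """
    own = tile.get(operator, 0.0)
    if own <= 0.0:
        return 0  # no 4G/5G coverage in this EA: no points
    beaten = sum(1 for op, share in tile.items()
                 if op != operator and share < own)
    return 1 + beaten

# One EA with four contenders (invented broadband usage shares):
ea = {"Op A": 0.92, "Op B": 0.80, "Op C": 0.55, "Op D": 0.0}

score_a = tile_points("Op A", ea)  # 1 coverage point + 3 beaten competitors
score_d = tile_points("Op D", ea)  # no broadband coverage, so 0 points
```

The final Coverage Excellence percentage would then relate the points obtained over all EAs to the maximum possible points, which is what the last sentence of the paragraph above describes.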

In addition, we consider the Time on Broadband. It reveals how often a single user had 4G or 5G reception in the observation period, independent from the EAs in which the samples were obtained. In order to calculate this, umlaut puts the number of samples with 4G/5G coverage into relation to the total number of all samples. Coverage Excellence and Time on Broadband results each provide 50 per cent of the points for the Quality of Broadband Service. Important: The percentages determined for both parameters reflect the respective degrees of fulfillment. They do not correspond to the percentage of 4G/5G coverage of an area or population.

DATA RATES AND LATENCY

Additionally, umlaut investigates the data rates and latencies that were actually available to each user. The examination of these parameters is independent from the EAs and thus concentrates on the experience of each single user. Samples which were, for instance, obtained via WiFi or with the smartphone's flight mode active are filtered from the data pool before further analysis.

In order to take into account the fact that many mobile phone tariffs limit data rates, umlaut has defined speed classes which correspond to particular applications: for Basic Internet, 2 Mbps are sufficient; HD Video requires 5 Mbps; and for UHD Video the minimum is 20 Mbps. In order for a sample to count as valid, a minimum amount of data must have been transmitted within a 15-minute period. The same principle also applies to the assignment of a data packet's latency to the corresponding application-based classes: roundtrip times up to 100 ms are sufficient for OTT Voice, while 50 ms and faster qualify a sample for Gaming.

In the assessment, umlaut assigns the data rate and latency observed in a sample to one of these performance classes. Basic Internet then accounts for 60 per cent of the Data Rate score, HD Video for 30 per cent and UHD Video for 10 per cent (see table on the right-hand side). The Latency score incorporates OTT Voice with a share of 80 per cent and Gaming with a share of 20 per cent.
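The class thresholds and weights above can be combined into a weighted score. This sketch assumes that a sample counts toward every class whose threshold it fulfils and that each class weight is applied to the share of fulfilling samples; the exact aggregation is not spelled out in the text, and the sample values are invented.

```python
SPEED_CLASSES = [              # (label, minimum Mbit/s, weight)
    ("Basic Internet",  2.0, 0.60),
    ("HD Video",        5.0, 0.30),
    ("UHD Video",      20.0, 0.10),
]
LATENCY_CLASSES = [            # (label, maximum roundtrip ms, weight)
    ("OTT Voice", 100.0, 0.80),
    ("Gaming",     50.0, 0.20),
]

def data_rate_score(samples_mbps):
    """Weighted share of samples fulfilling each speed class (0..1)."""
    n = len(samples_mbps)
    return sum(w * sum(1 for s in samples_mbps if s >= lo) / n
               for _, lo, w in SPEED_CLASSES)

def latency_score(samples_ms):
    """Weighted share of samples fulfilling each latency class (0..1)."""
    n = len(samples_ms)
    return sum(w * sum(1 for s in samples_ms if s <= hi) / n
               for _, hi, w in LATENCY_CLASSES)

rates = [1.5, 3.0, 8.0, 25.0]      # invented sample rates in Mbit/s
rtts  = [30.0, 70.0, 120.0, 45.0]  # invented roundtrip times in ms

rate_score = data_rate_score(rates)
lat_score  = latency_score(rtts)
```

The weighting means that broad fulfilment of the modest Basic Internet and OTT Voice thresholds matters most, while the demanding UHD Video and Gaming classes add a smaller premium on top.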

umlaut‘s fleet of test cars is equipped with up-to-date test smartphones. The phones on board are operated and supervised by a unique control system.


One Samsung Galaxy S10 per operator was used for the voice measurements and another Galaxy S10 for half of the data measurements. In the second car and in the walktest team‘s backpack a Galaxy S20+ was used and set to “5G preferred“.


The umlaut staff analysed hundreds of thousands of measurement values during and after the tests.


 

CONCLUSION

T-Mobile wins for the fifth time in a row with the grade "outstanding". KPN ranks second, also achieving the grade "outstanding" and leading in the Data discipline. In addition, KPN receives connect's Innovation Award for the best overall 5G performance. Vodafone ranks third with the grade "very good", leading in the Crowdsourcing category.

The overall winner of the umlaut connect Mobile Benchmark in the Netherlands is T-Mobile, for the fifth time in a row. T-Mobile also scores best among the three competitors in the Voice category. KPN follows in second place at an overall distance of eight points, but leads in the Data category. Both T-Mobile and KPN achieve the exceptional grade "outstanding". Vodafone, with 5.1 million subscribers now the smallest operator in the country (T-Mobile: 6.7 million, KPN: 6.5 million), ranks third with the overall grade "very good". In Crowdsourcing, Vodafone leads with a margin of one point ahead of T-Mobile.

In comparison to the results of the previous umlaut connect Mobile Benchmark in the Netherlands, which was published in spring 2019, all three operators have lost some points. To some extent, this can be explained by the very high performance level in the Netherlands. As we continually raise the thresholds of our scoring in order to keep pace with technological advancement, achieving score gains gets more and more difficult even for strong contenders. Also, our Crowdsourcing methodology was significantly updated (see page 14), which makes the results in this discipline not directly comparable to those of our previous benchmark. But what is most important this year: all three Dutch operators manage to provide very stable connections to their users, even in today's particularly demanding times.
