TESTING METHODOLOGY 

THE METHODOLOGY OF THE P3 CONNECT MOBILE BENCHMARK IS THE RESULT OF MORE THAN 15 YEARS OF TESTING MOBILE NETWORKS. TODAY, NETWORK TESTS ARE CONDUCTED IN MORE THAN 80 COUNTRIES. OUR METHODOLOGY WAS CAREFULLY DESIGNED TO EVALUATE AND OBJECTIVELY COMPARE THE PERFORMANCE AND SERVICE QUALITY OF MOBILE NETWORKS FROM THE USERS’ PERSPECTIVE. 

The P3 connect Mobile Benchmark Australia comprises the results of extensive voice and data drivetests and walktests as well as a sophisticated crowdsourcing approach.

DRIVETESTS AND WALKTESTS

The drivetests and walktests in Australia took place from October 29 to November 16, 2018. All samples were collected during the day, between 8.00 a.m. and 10.00 p.m. The network tests covered inner-city, outer metropolitan and suburban areas. Measurements were also taken in smaller towns and on the connecting highways. The four measurement cars together covered about 4,280 kilometres in the cities, about 940 kilometres in towns and about 8,000 kilometres on the roads, resulting in a total of 13,220 kilometres. The combination of test areas was selected to provide representative test results across the Australian population. The areas selected for the 2018 test account for more than 15.1 million people, or roughly 67 per cent of the total population of Australia.
The tests were conducted according to P3’s “Large Country” model. This means that the measurements focused on metropolitan agglomerations and their closer environments. The routes and all visited cities and towns are shown here. The four drivetest cars were equipped with Samsung Galaxy S8 smartphones. The simultaneous measurement of voice and data services was conducted with these mass-market devices to obtain a realistic picture of the users’ experience.

VOICE TESTING

One smartphone per operator in each car was used for the voice tests, setting up test calls from one car to another. The walktest team also carried one smartphone per operator for the voice tests. In this case, the smartphones called a stationary counterpart. The audio quality of the calls was evaluated using the HD-voice-capable, ITU-standardised POLQA wideband algorithm.
All smartphones used for the voice tests were set to VoLTE preferred mode. In networks or areas where this modern 4G-based voice technology was not available, the phones fell back to 3G. In order to account for typical smartphone use during the voice tests, background data traffic was generated through random injection of small amounts of HTTP traffic. The voice scores account for 34 per cent of the total results.

DATA TESTING

Data performance was measured using three more Galaxy S8 smartphones in each car, one per operator. Their radio access technology was set to LTE preferred mode.

For the web tests, the smartphones accessed web pages according to the widely recognised Alexa ranking. In addition, the static Kepler test web page as specified by ETSI was used. In order to test the data service performance, files of 3 MB and 1 MB for download and upload were transferred from or to a test server located on the Internet. In addition, the peak data performance was tested in uplink and downlink directions by assessing the amount of data that was transferred within a seven-second time period.
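As a rough illustration of the peak test, the amount of data moved in the fixed seven-second window can be turned into an average data rate. The following is a minimal sketch, not P3's actual tooling: the function name and the sample byte count are hypothetical, and it assumes decimal units (1 Mbit = 1,000,000 bits).

```python
def throughput_mbit_per_s(bytes_transferred: int, window_seconds: float = 7.0) -> float:
    """Average data rate in Mbit/s for a transfer measured over a fixed time window."""
    if window_seconds <= 0:
        raise ValueError("window_seconds must be positive")
    bits = bytes_transferred * 8            # bytes -> bits
    return bits / window_seconds / 1_000_000  # bits per second -> Mbit/s


# Hypothetical sample: 35,000,000 bytes transferred during the seven-second window.
example_rate = throughput_mbit_per_s(35_000_000)
print(f"{example_rate:.1f} Mbit/s")
```

Because the window length is fixed, a faster network simply moves more bytes in the same time, which is why the transferred volume can serve directly as the measure of peak performance.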

The evaluation of YouTube playback takes into account that YouTube dynamically adapts the video resolution to the available bandwidth. So, in addition to success ratios, start times and playouts without interruptions, the measurements also determined the average video resolution. All the tests were conducted with the best-performing mobile plan available from each operator. Data scores account for 51 per cent of the total results.

CROWDSOURCING

Additionally, P3 conducted crowd-based analyses of the Australian networks, which contribute 15 per cent to the end result. They are based on data that was gathered in August, September and October 2018. For the collection of crowd data, P3 has integrated background diagnosis processes into more than 800 diverse Android apps. If one of these applications is installed on the end user’s phone and the user authorises the background analysis, data collection takes place 24/7, 365 days a year. Reports are generated for every quarter of an hour and sent daily to P3’s cloud servers.

Such reports generate just a small number of bytes per message and do not include any personal user data. Interested parties can deliberately take part in the data gathering with the dedicated “U get” app (see the box “Participate in our crowdsourcing”).
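The three weightings stated in this methodology (voice 34 per cent, data 51 per cent, crowdsourcing 15 per cent) add up to 100 per cent. The following is a minimal sketch of how such weights could combine into one overall percentage. It assumes, purely for illustration, that each category result is expressed as the fraction of its maximum points that an operator achieved; the function name and input format are hypothetical and not taken from P3.

```python
# Category weights as stated in the methodology text.
WEIGHTS = {"voice": 0.34, "data": 0.51, "crowd": 0.15}


def total_score(achieved: dict[str, float]) -> float:
    """Combine per-category results (each a fraction 0..1) into an overall percentage.

    Assumption: 'achieved' holds the share of the maximum points reached per category.
    """
    if set(achieved) != set(WEIGHTS):
        raise ValueError("expected exactly the categories: voice, data, crowd")
    for category, fraction in achieved.items():
        if not 0.0 <= fraction <= 1.0:
            raise ValueError(f"{category} must be between 0 and 1")
    return 100 * sum(WEIGHTS[c] * achieved[c] for c in WEIGHTS)


# Hypothetical operator reaching 90 % of voice, 80 % of data and 95 % of crowd points.
print(round(total_score({"voice": 0.90, "data": 0.80, "crowd": 0.95}), 2))
```

With these weights, a given percentage shortfall in the data category costs more of the overall result than the same shortfall in voice or crowdsourcing.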

NETWORK COVERAGE

For the assessment of network coverage, P3 lays a grid of 2 by 2 kilometres over the whole test area. The “evaluation areas” generated this way are then subdivided into 16 smaller tiles. To ensure statistical relevance, P3 requires a certain number of users and measurement values per operator for each tile and each evaluation area. If these thresholds are not met by one of the operators, this part of the map will not be considered in the assessment for the sake of fairness.
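The grid logic described above can be sketched as follows. This is an illustration under stated assumptions, not P3's implementation: positions are assumed to be already projected to metres, the 16 tiles are assumed to form a 4 by 4 layout of 500-metre squares, and the threshold values are placeholders because the text does not state the real ones. All function names are hypothetical.

```python
import math

AREA_SIZE_M = 2_000      # edge length of an evaluation area, from the text
TILES_PER_SIDE = 4       # assumption: the 16 tiles form a 4 x 4 layout
TILE_SIZE_M = AREA_SIZE_M / TILES_PER_SIDE


def locate(x_m: float, y_m: float) -> tuple[tuple[int, int], int]:
    """Return ((area_col, area_row), tile_index 0..15) for a position in metres."""
    area = (math.floor(x_m / AREA_SIZE_M), math.floor(y_m / AREA_SIZE_M))
    # Offset inside the evaluation area; clamp to guard against float rounding at the edge.
    col = min(int((x_m % AREA_SIZE_M) // TILE_SIZE_M), TILES_PER_SIDE - 1)
    row = min(int((y_m % AREA_SIZE_M) // TILE_SIZE_M), TILES_PER_SIDE - 1)
    return area, row * TILES_PER_SIDE + col


def is_evaluated(counts: dict[str, tuple[int, int]], min_users: int, min_samples: int) -> bool:
    """True only if every operator meets both thresholds (users, samples) in this cell."""
    return bool(counts) and all(
        users >= min_users and samples >= min_samples
        for users, samples in counts.values()
    )


# Hypothetical cell: Operator C misses the placeholder thresholds, so the cell is skipped.
cell = {"Operator A": (40, 900), "Operator B": (35, 700), "Operator C": (3, 20)}
print(locate(2500.0, 100.0), is_evaluated(cell, min_users=10, min_samples=100))
```

Dropping a cell for all operators as soon as one of them lacks data keeps the comparison symmetric: no operator is rated on map areas where a competitor could not be rated as well.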

“Quality of Coverage” reveals whether voice and data services actually work in an evaluation area. P3 determines this because mobile services cannot actually be used in every area that nominally provides network reception. We specify these values for the coverage of voice services (3G and 4G combined), data (3G and 4G combined) and 4G only.

DATA THROUGHPUTS

Additionally, P3 investigates the data rates that were actually available to each user. For this purpose, we determine the best data rate obtained by each user during the evaluation period and then calculate the average of these values. In addition, we determine the so-called P90 values for the top throughput of each evaluation area as well as for each user’s best throughput. A P90 value is the threshold in a statistical distribution below which 90 per cent of the gathered values fall; it depicts how fast the network is under favourable conditions.
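The per-user statistics described above can be sketched as follows. This is a minimal illustration rather than P3's implementation: the sample data is invented, the function names are hypothetical, and the percentile uses the simple nearest-rank definition, which is an assumption because the text does not say which percentile method is applied.

```python
from statistics import mean


def best_rate_per_user(samples: list[tuple[str, float]]) -> dict[str, float]:
    """Keep the highest observed data rate (Mbit/s) for every user."""
    best: dict[str, float] = {}
    for user, rate in samples:
        best[user] = max(rate, best.get(user, rate))
    return best


def percentile_nearest_rank(values: list[float], pct: int) -> float:
    """Nearest-rank percentile: smallest value with at least pct % of values at or below it."""
    if not values:
        raise ValueError("values must not be empty")
    if not 0 < pct <= 100:
        raise ValueError("pct must be in 1..100")
    ordered = sorted(values)
    rank = (pct * len(ordered) + 99) // 100   # integer ceiling of pct/100 * n
    return ordered[rank - 1]


# Invented measurements: (user id, data rate in Mbit/s).
samples = [("u1", 10.0), ("u1", 30.0), ("u2", 20.0), ("u3", 50.0)]
best = best_rate_per_user(samples)
print(mean(best.values()), percentile_nearest_rank(list(best.values()), 90))
```

The average of the per-user best values summarises what users typically reached at their best, while the P90 value points to the upper end of the distribution.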

DATA SERVICE AVAILABILITY

This parameter indicates the number of outages or service degradations, that is, events where the number of cases of impaired data connectivity significantly exceeds the expected level. To judge this, the algorithm looks at a sliding window around the hour of interest. This ensures that we only consider actual degradations as opposed to a simple loss of network coverage due to prolonged indoor stays or similar reasons. In order to ensure statistical relevance, each operator must have sufficient statistics for trend and noise analyses for each evaluated hour. The exact number depends on the market size and number of operators. A valid assessment month must comprise at least 90 per cent valid assessment hours. Deviating from the other crowd score elements, Data Service Availability is rated based on a six-month observation period, in this case from May to October 2018.
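The validity rule for an assessment month can be sketched as follows. This is a minimal sketch with a hypothetical function name, assuming each hour has already been flagged as valid or not. The degradation detection itself is not sketched, since the text does not specify the sliding-window algorithm or its thresholds.

```python
def month_is_valid(valid_hour_flags: list[bool], min_percent: int = 90) -> bool:
    """True if at least min_percent of the month's assessment hours are valid."""
    if not valid_hour_flags:
        return False
    valid_hours = sum(valid_hour_flags)
    # Integer comparison avoids floating-point rounding at exactly 90 per cent.
    return valid_hours * 100 >= min_percent * len(valid_hour_flags)


# Hypothetical 30-day month (720 hours) with 648 valid hours, i.e. exactly 90 per cent.
flags = [True] * 648 + [False] * 72
print(month_is_valid(flags))
```

In this sketch, a month that falls short of the 90 per cent requirement is simply reported as not valid; how such months are then treated in the rating is not described in the text.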

Two boxes were mounted into the rear and side windows of each measurement car in order to support eight smartphones per car.

One Samsung Galaxy S8 per operator took the voice measurements and one additional S8 per operator was used for the data tests. All test phones were operated and supervised by P3’s unique control system.

 

PARTICIPATE IN OUR CROWDSOURCING

Everybody interested in being part of our global crowdsourcing panel and in obtaining insights into the reliability of the mobile network that her or his smartphone is logged into can most easily participate by installing and using the “U get” app. This app concentrates exclusively on network analyses and is available at http://uget-app.com. “U get” checks and visualises the current mobile network performance and contributes the results to our crowdsourcing platform. Join the global community of users who understand their personal wireless performance, while contributing to the world’s most comprehensive picture of mobile customer experience.



CONCLUSION

TELSTRA TAKES BACK THE “BEST IN TEST” ACCOLADE FROM OPTUS, BUT BOTH OPERATORS ARE “VERY GOOD”. VODAFONE CONFIRMS ITS POSITION WITH OVERALL GOOD RESULTS.

As in previous years, the two largest Australian operators fought a close race for the top rank. In this year’s P3 connect Mobile Benchmark Australia, the overall winner is Telstra, taking back the top rank from last year’s winner Optus. Ultimately, it was Telstra’s very good crowd results that tipped the scales. Telstra also achieved minimally better voice results, while Optus managed to gain more points in the data discipline. In the score for the classical drivetest and walktest categories, Optus leads ahead of Telstra.
A direct comparison to last year’s results requires adapting the maximum point numbers due to the addition of the new crowd score points this year. This examination reveals that Telstra improved its scores in both the voice and data disciplines, while Optus lost a small number of points in the voice discipline but improved in the data category. Vodafone achieved slightly lower results in both the voice and data assessments compared to the previous year.
The fact that two out of three operators managed to improve even though we have raised the bar in various parts of our methodology shows the considerable effort that the operators put into upgrading their networks. It also supports our claim that our demanding network benchmarks contribute to the constant improvement of mobile networks all over the world.
