This Data Challenge poses data science problems in the context of urban energy sustainability. More precisely, the challenge focuses on sensor data from a smart building located on a university campus. The building is a so-called multi-tenant building, which means that it is used by different types of organizations. The Data Challenge addressed the following questions: Can we predict the occupancy of the lecture zones efficiently and reliably for different times of the day? Can we complete missing information in the sensor overview by deriving patterns from the sensor data, and so identify the zone in which the unknown sensors are mounted?
The building consists of multiple floors. Each floor contains different types of rooms: rooms used for lectures, rooms for project meetings, and rooms that can be used for project demonstrations. Rooms are grouped in zones. Some of the zones, such as the main entrance hall, the stairs, the canteen and the toilets, are public zones that can be accessed by everyone. A few of them, such as the demonstration area on the ground floor, are used as a walk-through to other zones. There are also private zones, which can be accessed only by a single organization. Each zone contains one or more sensors, grouped in boxes and mounted on ceilings or walls. They provide data on temperature, movement, light, and/or CO2.
Topological information about the sensors in a building is sometimes incomplete or outdated. Sensors may have been re-mounted, rooms may have been split, or the administration may not have been precise. In such situations, one does have data coming from the sensors, captured somewhere in the measurement database, but does not know in which zone those sensors are mounted.
The Data Challenge addressed the following questions:
- Can we complete missing information in the sensor overview by deriving patterns from the sensor data, and so identify the zone in which the unknown sensors are mounted?
- Can we predict the occupancy of the lecture zones efficiently and reliably for different times of the day?
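One possible way to approach the first question is to compare the time series of an unknown sensor with those of sensors whose zone is known, and to pick the zone whose sensors behave most similarly. The sketch below is only an illustration under stated assumptions, not the method used by the organizers or the winning team: the function names `pearson` and `guess_zone`, the zone labels, and the CO2 readings are all hypothetical, and Pearson correlation is just one of many similarity measures that could be tried.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation of two equally long sequences (0.0 if either is constant)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / sqrt(var_x * var_y)


def guess_zone(unknown_series, known_series_by_zone):
    """Return (zone, score) for the zone whose known sensors correlate best,
    on average, with the unknown sensor."""
    scores = {}
    for zone, series_list in known_series_by_zone.items():
        correlations = [pearson(unknown_series, s) for s in series_list]
        scores[zone] = sum(correlations) / len(correlations)
    best = max(scores, key=scores.get)
    return best, scores[best]


# Made-up hourly CO2 readings (ppm); zone names are hypothetical.
known = {
    "lecture_zone": [[420, 450, 800, 950, 900, 500]],
    "canteen": [[420, 430, 440, 900, 880, 450]],
}
unknown = [410, 460, 780, 940, 910, 480]

zone, score = guess_zone(unknown, known)
print(zone, round(score, 3))
```

In practice the series would need to be aligned on a common time grid first, and sensors of the same type (for example CO2 with CO2) would be compared with each other.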
Although the data from the building sensors gives an indication of occupancy, there is neither a ground truth nor a unit in which the sensor data can be expressed directly as an amount of occupancy. One has to find a way to translate the sensor data into a useful interpretation of occupancy (or occupancy density).
How can we interpret and translate the building sensor data in terms of occupancy?
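Because there is no ground truth, any answer to this question is a modelling choice. As a minimal sketch of one such choice, the code below scales CO2 and movement readings to the range 0 to 1, blends them into a unitless occupancy index, and averages that index per hour of the day as a naive baseline for the prediction question. The function names, the equal weighting of the two signals, and the readings are all assumptions made for illustration; they do not come from the challenge data.

```python
from collections import defaultdict


def min_max_scale(values):
    """Scale values to [0, 1]; a constant series maps to all zeros."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def occupancy_index(co2_ppm, motion_counts, co2_weight=0.5):
    """Blend scaled CO2 and motion into a unitless 0..1 index.
    The weight is an assumption, not a calibrated value."""
    co2_scaled = min_max_scale(co2_ppm)
    motion_scaled = min_max_scale(motion_counts)
    return [
        co2_weight * c + (1 - co2_weight) * m
        for c, m in zip(co2_scaled, motion_scaled)
    ]


def hourly_profile(hours, index_values):
    """Average the index per hour of day: a naive baseline for that hour."""
    buckets = defaultdict(list)
    for hour, value in zip(hours, index_values):
        buckets[hour].append(value)
    return {hour: sum(vals) / len(vals) for hour, vals in sorted(buckets.items())}


# Made-up readings for two days at 08:00, 10:00 and 20:00.
hours = [8, 10, 20, 8, 10, 20]
co2 = [450, 900, 400, 500, 1000, 400]
motion = [2, 10, 0, 4, 8, 0]

index = occupancy_index(co2, motion)
profile = hourly_profile(hours, index)
print(profile)
```

The index only ranks time slots relative to each other within one zone; turning it into a head count would require some form of calibration, such as timetable data or manual counts.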
The Data Challenge/hackathon described here is a first step toward the long-term goal of this research: to predict energy consumption based on such sensor data and to make recommendations that can reduce energy consumption, contributing to the efficient and sustainable use of smart buildings.
Links
- A detailed report of the winning solution, with steps and graphs, is provided here: Mansur Nurmukhambetov (nomomon.github.io)
- Similar data is available on the Kaggle platform.
Calendar
The Data Challenge was spread over three evening workshops (March/April 2023), during which the organizers sat with the teams in the building itself. In that sense, this Data Challenge was more of a hybrid challenge/hackathon than a pure offline Data Challenge:
- March 16, 2023: 1st evening 17:00 – 21:00
- March 30, 2023: 2nd evening 17:00 – 21:00
- April 13, 2023: 3rd evening 17:00 – 21:00