Mastering Data Binning with Swift Charts

Data binning is a common data processing technique that divides continuous numerical or temporal data into intervals, usually adjacent and non-overlapping. Together the intervals cover the entire data range, and each data point falls into exactly one of them. Binning makes complex datasets easier to analyze, visualize, and summarize statistically. This article explores how to use the APIs provided by Swift Charts to bin data precisely and efficiently.

The Magic of Data Binning in Swift Charts

In my exploration of Swift Charts, I was not only focused on how to use it but also fascinated by its underlying design philosophy and implementation techniques. As an excellent declarative charting framework, Swift Charts offers powerful features that traditional charting frameworks often lack. In most cases, developers need only make simple declarations, and the framework can automatically handle complex chart drawing tasks.

Notably, when an axis carries numerical or date data, Swift Charts can automatically generate continuous, non-overlapping intervals based on preset parameters. In many other charting frameworks developers have to implement this themselves; Swift Charts greatly simplifies the process.

Let’s demonstrate the application of data binning in Swift Charts through a specific example:

Swift
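import Charts
import SwiftUI

// Assumed model: the article does not show the original ChartData definition
struct ChartData: Identifiable {
  let id = UUID()
  let date: Date
  let sales: Double
  let profit: Double
}

let demoData: [ChartData] = [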
  ChartData(date: Calendar.current.date(byAdding: .month, value: 0, to: Date())!, sales: 150, profit: 120),
  ChartData(date: Calendar.current.date(byAdding: .month, value: 1, to: Date())!, sales: 200, profit: 140),
  ChartData(date: Calendar.current.date(byAdding: .month, value: 3, to: Date())!, sales: 180, profit: 130),
  ChartData(date: Calendar.current.date(byAdding: .month, value: 4, to: Date())!, sales: 170, profit: 160),
]

struct ChartDemo: View {
  var body: some View {
    Chart {
      ForEach(["Sales", "Profit"], id: \.self) { series in
        ForEach(demoData) { entry in
          LineMark(
            x: .value("Date", entry.date), // X-Axis: Automatically perform binning on date data
            y: .value(series, series == "Sales" ? entry.sales : entry.profit)
          )
          .symbol(by: .value("Type", series))
          .foregroundStyle(by: .value("Type", series))
        }
      }
    }
    .chartXAxis {
      // X-Axis: Automatically display bins by month
      AxisMarks(values: .stride(by: .month)) { date in
        AxisValueLabel(format: .dateTime.month()) // Date is binned by month
        AxisGridLine()
      }
    }
    .chartYScale(domain: [50.0, 250]) // Set the data range for Y-Axis
    .chartYAxis {
      // Y-Axis: Display tick marks at specified positions
      AxisMarks(values: [50, 150, 250]) { value in
        AxisGridLine()
        AxisValueLabel()
      }
    }
    .aspectRatio(1, contentMode: .fit)
    .padding(32)
  }
}

In this example, we only provide the basic data, and Swift Charts automatically generates time ticks on the X-axis, spaced according to the specified date unit. On the Y-axis, we adjusted the displayed data range and placed ticks at specific positions. Data binning is quietly at work behind both features.

Although the basic concept of data binning seems simple, implementing an efficient and stable binning algorithm involves many challenges. Fortunately, Swift Charts exposes APIs dedicated to data binning: NumberBins and DateBins. This means that even if we don’t build charts with them, we can use these APIs for efficient and stable binning in other scenarios. This design makes the framework more flexible and gives developers a solid tool for a variety of data analysis tasks.
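To make those challenges concrete, here is a naive, hand-rolled equal-width binning sketch in plain Swift. It is not how Swift Charts implements binning; the function name naiveBinIndex and its parameters are made up for illustration. Even this tiny version has to decide what happens at the upper edge and outside the range, which is the kind of detail NumberBins and DateBins take care of.

```swift
/// Naive equal-width binning (illustrative only, not the Swift Charts algorithm).
/// Returns the bin index of `value`, or nil when it lies outside `range`.
func naiveBinIndex(of value: Double, in range: ClosedRange<Double>, binCount: Int) -> Int? {
    guard binCount > 0, range.lowerBound < range.upperBound, range.contains(value) else {
        return nil
    }
    let width = (range.upperBound - range.lowerBound) / Double(binCount)
    let rawIndex = Int((value - range.lowerBound) / width)
    // The maximum value would otherwise land in a nonexistent extra bin,
    // so the last bin is treated as closed on both ends.
    return min(rawIndex, binCount - 1)
}

let lowest = naiveBinIndex(of: 100, in: 100...500, binCount: 4)   // bin 0: 100..<200
let middle = naiveBinIndex(of: 300, in: 100...500, binCount: 4)   // bin 2: 300..<400
let highest = naiveBinIndex(of: 500, in: 100...500, binCount: 4)  // bin 3: 400...500
let outside = naiveBinIndex(of: 600, in: 100...500, binCount: 4)  // nil: out of range
```

The sketch ignores everything that makes real binning hard, such as choosing a sensible bin count or rounding boundaries to readable values.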

NumberBins: A Tool for Number Binning

NumberBins is a powerful numerical binning tool provided by Swift Charts, suitable for integers and floating-point numbers. Let’s explore its basic usage through an example:

Swift
import Charts

let data = [100, 200, 250, 420, 500]
let bins = NumberBins(thresholds: data)

print(bins.thresholds) // Output data thresholds: [100, 200, 250, 420, 500]

// Iterate and display the binned data intervals
for value in bins {
    print(value)
}

print("count:", bins.count) // Output number of intervals: 4
print("index:", bins.index(for: 300)) // Determine which interval 300 belongs to, returns index 2 (corresponding to 250..<420)

// Output result
[100, 200, 250, 420, 500]
ChartBinRange<Int>(lowerBound: 100, upperBound: 200, isClosed: false)
ChartBinRange<Int>(lowerBound: 200, upperBound: 250, isClosed: false)
ChartBinRange<Int>(lowerBound: 250, upperBound: 420, isClosed: false)
ChartBinRange<Int>(lowerBound: 420, upperBound: 500, isClosed: true)
count: 4
index: 2

After creating a NumberBins instance, we can obtain the following information:

  • thresholds: the bin boundaries
  • count: the number of bins
  • Iterating over the instance yields each bin as a ChartBinRange
  • index(for:) returns the index of the bin that a given value belongs to

It is important to note that the index(for:) method will always return an index. Even if the queried value does not belong to any actual bin range, NumberBins will still return a virtual index. Therefore, it is recommended to check whether the value falls within the defined bin range before performing the query to avoid unexpected results.
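One way to apply that advice is a small wrapper that performs the range check before delegating to the lookup. The sketch below is an assumption-laden illustration: checkedBinIndex is a hypothetical helper, and the lookup is passed in as a closure so the example runs without Swift Charts. The stand-in lookup is a simple approximation written for this example, not the framework's implementation.

```swift
/// Returns a bin index only when `value` lies inside the range covered by `thresholds`.
/// `lookup` is whatever index function you use.
/// With Swift Charts: checkedBinIndex(for: 300, thresholds: bins.thresholds) { bins.index(for: $0) }
func checkedBinIndex<Value: Comparable>(
    for value: Value,
    thresholds: [Value],
    lookup: (Value) -> Int
) -> Int? {
    guard let lower = thresholds.min(),
          let upper = thresholds.max(),
          (lower...upper).contains(value) else {
        return nil
    }
    return lookup(value)
}

// Stand-in lookup for demonstration (NOT the Swift Charts implementation):
// position of the last threshold that is <= value, clamped to the last bin.
let thresholds = [100, 200, 250, 420, 500]
let standInLookup: (Int) -> Int = { value in
    let passed = thresholds.filter { $0 <= value }.count
    return min(max(passed - 1, 0), thresholds.count - 2)
}

let inside = checkedBinIndex(for: 300, thresholds: thresholds, lookup: standInLookup)   // 2 (250..<420)
let outside = checkedBinIndex(for: 900, thresholds: thresholds, lookup: standInLookup)  // nil
```

Returning an optional forces the caller to handle out-of-range values explicitly instead of silently receiving an index that matches no real bin.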

NumberBins provides multiple constructors, each applying slightly different rules when creating intervals:

  1. NumberBins(thresholds:) constructs intervals strictly according to the given thresholds and order, without automatically adjusting the order or removing duplicates.

    Swift
    let data = [200, 100, 200, 250]
    let bins = NumberBins(thresholds: data)
    // 200..<100, 100..<200, 200...250

  2. NumberBins(data:desiredCount:minimumStride:) decides how to split the data based on desiredCount and minimumStride, and may expand the overall data range.

    Swift
    // When desiredCount is nil, it uses Scott's normal reference rule to automatically calculate the number of bins
    let data = [100, 200, 800, 400, 500]
    let autoBins = NumberBins(data: data)
    // thresholds: [0, 500, 1000]

    // Try to bin according to the desiredCount
    let countedBins = NumberBins(data: [-100, 200, 800, 400, 500], desiredCount: 5)
    // Result: [-200, 0, 200, 400, 600, 800]

    // Prioritize satisfying the minimum stride
    let stridedBins = NumberBins(data: [-100, 200, 800, 400, 500], desiredCount: 5, minimumStride: 400)
    // Result: [-500, 0, 500, 1000]

  3. NumberBins(range:desiredCount:minimumStride:) is similar to the data version but based on a given range:

    Swift
    let countedBins = NumberBins(range: 10...100, desiredCount: 3)
    // Result: [0, 25, 50, 75, 100]
    // Note: although desiredCount is 3, NumberBins picks a bin count better suited to display and keeps every bin the same length

    // Ensure each bin is not smaller than minimumStride
    let stridedBins = NumberBins(range: -50...100, minimumStride: 30)
    // Result: [-50, 0, 50, 100]

  4. NumberBins(size:range:) uses a fixed bin size:

    Swift
    // To ensure consistent bin sizes, automatically extends both ends
    let bins = NumberBins(size: 30.0, range: -100.0...200.0)
    // Result: [-120.0, -90.0, -60.0, -30.0, 0.0, 30.0, 60.0, 90.0, 120.0, 150.0, 180.0, 210.0]

As you can see, NumberBins does more than split data mechanically; it weighs several factors, in particular whether the resulting bins are suitable for display. This makes it a powerful tool for handling numerical data.
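For readers curious about Scott's normal reference rule, mentioned above for NumberBins(data:) without a desiredCount, the textbook formula is h = 3.49 · s / n^(1/3), where s is the sample standard deviation and n the number of values. The sketch below applies that formula by hand; scottBinWidth is a hypothetical helper, and how NumberBins then rounds to display-friendly boundaries is not something this sketch attempts to reproduce.

```swift
import Foundation

/// Bin width suggested by Scott's normal reference rule: h = 3.49 * s / n^(1/3),
/// where s is the sample standard deviation and n the number of values.
func scottBinWidth(for values: [Double]) -> Double? {
    guard values.count > 1 else { return nil }
    let n = Double(values.count)
    let mean = values.reduce(0, +) / n
    let variance = values.reduce(0) { $0 + ($1 - mean) * ($1 - mean) } / (n - 1)
    return 3.49 * variance.squareRoot() / pow(n, 1.0 / 3.0)
}

let sample: [Double] = [100, 200, 800, 400, 500]
let width = scottBinWidth(for: sample)!                   // roughly 559
let span = sample.max()! - sample.min()!                  // 700
let suggestedBinCount = Int((span / width).rounded(.up))  // 2
```

For this sample the rule suggests two bins, which is in line with the two bins in the thresholds [0, 500, 1000] shown earlier.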

DateBins: A Tool for Date Binning

DateBins is similar in usage and concept to NumberBins. As a date binning tool, its biggest feature is the ability to intelligently bin dates based on specified calendar units or time intervals. Let’s explore its functionality through examples:

Swift
import Charts
import Foundation

// Set up a calendar for the China time zone
var calendar = Calendar(identifier: .gregorian)
calendar.timeZone = TimeZone(identifier: "Asia/Shanghai")!

// Define date range: 2024/5/5 - 2024/10/15
let startDate = calendar.date(from: DateComponents(year: 2024, month: 5, day: 5))!
let endDate = calendar.date(from: DateComponents(year: 2024, month: 10, day: 15))!

// Bin by "month"
let bins = DateBins(unit: .month, range: startDate...endDate, calendar: calendar)

let dateFormatter = DateFormatter()
dateFormatter.calendar = calendar
dateFormatter.timeZone = calendar.timeZone
dateFormatter.dateFormat = "yyyy-MM-dd HH:mm:ss Z"

// Output thresholds (bin points)
print("Thresholds")
for threshold in bins.thresholds {
  print(dateFormatter.string(from: threshold))
}

// Output date ranges
print("Range:")
dateFormatter.dateFormat = "yyyy-MM-dd"
for value in bins {
  let rangeType = value.upperBound == bins.thresholds.last ? "..." : "..<"
  print(
    "\(dateFormatter.string(from: value.lowerBound))\(rangeType)\(dateFormatter.string(from: value.upperBound))")
}

// Output:
Thresholds
2024-05-01 00:00:00 +0800
2024-06-01 00:00:00 +0800
2024-07-01 00:00:00 +0800
2024-08-01 00:00:00 +0800
2024-09-01 00:00:00 +0800
2024-10-01 00:00:00 +0800
2024-11-01 00:00:00 +0800
Range:
2024-05-01..<2024-06-01
2024-06-01..<2024-07-01
2024-07-01..<2024-08-01
2024-08-01..<2024-09-01
2024-09-01..<2024-10-01
2024-10-01...2024-11-01

In addition to using calendar units, we can also bin based on specific time intervals:

Swift
let timeInterval: TimeInterval = 3 * 24 * 60 * 60 // 3 days
let bins = DateBins(timeInterval: timeInterval, range: startDate...endDate)

// Partial output:
2024-05-05 00:00:00 +0800
2024-05-08 00:00:00 +0800
2024-05-11 00:00:00 +0800
...

This flexibility of DateBins makes it an ideal tool for handling time series data. Whether you need to bin by month, week, or day, or require custom time intervals, DateBins can easily handle it. This is particularly useful for creating time-related charts, reports, or data analysis tasks.
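For fixed time intervals, the bin a date falls into can be reasoned about with simple arithmetic. The Foundation-only sketch below assumes, as the partial output above suggests, that bins start at the lower bound of the range; fixedIntervalBinIndex is a hypothetical helper, not part of Swift Charts.

```swift
import Foundation

/// Index of the fixed-length bin containing `date`, counting from `start`.
/// Returns nil for dates before `start` or a non-positive interval.
func fixedIntervalBinIndex(for date: Date, start: Date, interval: TimeInterval) -> Int? {
    guard interval > 0, date >= start else { return nil }
    return Int(date.timeIntervalSince(start) / interval)
}

let rangeStart = Date(timeIntervalSinceReferenceDate: 0)
let threeDays: TimeInterval = 3 * 24 * 60 * 60
let sevenDaysLater = rangeStart.addingTimeInterval(7 * 24 * 60 * 60)

let index = fixedIntervalBinIndex(for: sevenDaysLater, start: rangeStart, interval: threeDays)  // 2
let before = fixedIntervalBinIndex(for: rangeStart.addingTimeInterval(-1), start: rangeStart, interval: threeDays)  // nil
```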

It’s worth noting that DateBins automatically handles complex calendar logic, such as leap years, the number of days in different months, and special rules of various calendar systems (like the Gregorian calendar, Lunar calendar, Islamic calendar, etc.). This intelligent handling ensures that the binning results remain accurate and consistent across different cultural and regional contexts. By leveraging Apple’s calendar framework, DateBins can adapt to date representations worldwide, greatly simplifying the workload for developers when dealing with cross-cultural and cross-regional date ranges.
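To see why calendar awareness matters, the following hand-rolled sketch computes month-aligned thresholds with Foundation's Calendar alone. It is an illustration of the idea, not the DateBins implementation, and monthThresholds is a hypothetical helper. For the date range used earlier it produces bins of different lengths, which a fixed TimeInterval cannot express.

```swift
import Foundation

/// Month-aligned thresholds covering `range` (hand-rolled; not the DateBins implementation).
func monthThresholds(covering range: ClosedRange<Date>, calendar: Calendar) -> [Date] {
    guard let firstMonth = calendar.dateInterval(of: .month, for: range.lowerBound) else {
        return []
    }
    var thresholds = [firstMonth.start]
    // Keep adding one calendar month until the range's upper bound is covered.
    while let last = thresholds.last, last < range.upperBound,
          let next = calendar.date(byAdding: .month, value: 1, to: last) {
        thresholds.append(next)
    }
    return thresholds
}

var calendar = Calendar(identifier: .gregorian)
// Fall back to a fixed UTC+8 offset if the named time zone is unavailable.
calendar.timeZone = TimeZone(identifier: "Asia/Shanghai") ?? TimeZone(secondsFromGMT: 8 * 3600)!

let start = calendar.date(from: DateComponents(year: 2024, month: 5, day: 5))!
let end = calendar.date(from: DateComponents(year: 2024, month: 10, day: 15))!
let thresholds = monthThresholds(covering: start...end, calendar: calendar)  // May 1 through Nov 1

// Length of each bin in days: [31, 30, 31, 31, 30, 31]
let binLengthsInDays = zip(thresholds, thresholds.dropFirst()).map { lower, upper in
    calendar.dateComponents([.day], from: lower, to: upper).day ?? 0
}
```

The seven thresholds match the month boundaries in the DateBins output shown earlier.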

Conclusion

This article explored two powerful data binning tools in the Swift Charts framework: NumberBins and DateBins. These tools not only provide developers with efficient and stable data binning capabilities but also have multiple advantages:

  1. Accuracy: They can intelligently handle various complex situations, such as special date rules and numerical boundary cases.
  2. Flexibility: Whether dealing with numerical or date data, these tools provide multiple binning methods to meet the needs of different scenarios.
  3. System Integration: Swift Charts ships with the operating system, so using these APIs doesn’t increase the app size. In that sense they are a “zero-cost” dependency.
  4. Cross-Cultural Adaptation: DateBins in particular can adapt to date representations worldwide, providing strong support for internationalized applications.

Apple’s frameworks contain many similar hidden gems waiting for developers to discover and use. These tools not only improve development efficiency but also help applications integrate closely with the Apple ecosystem.

In future articles, we will continue to explore and introduce more such practical tools and APIs. By deeply understanding and cleverly utilizing these system-level resources, we can build more efficient and smoother applications.
