Federated Learning in Private Computing

Data assets have become a key tool in product and service design. However, the centralized collection of user data puts personal privacy at risk and, in turn, exposes organizations to legal risk.

Since 2016, researchers have explored how data can be used while it remains with its owner at its point of origin, under the protection of user privacy. This has made federated learning and federated analytics a focus of attention, and as the scope of research has expanded, federated learning has begun to be applied to broader fields such as the Internet of Things.

So, what is federated learning?

Federated learning is a machine learning setting in which multiple entities collaborate to solve a machine learning problem, under the coordination of a central server or service provider. Each client's raw data is stored locally and is never exchanged or transferred; instead, the learning objective is achieved using focused updates intended for immediate aggregation.

Similarly, generating analytical insights from the combined information in scattered datasets is called federated analytics, and the same considerations that arise in federated learning apply to it.

This paper briefly introduces key concepts in federated learning and analytics, highlights how privacy technologies can be integrated into real-world systems, and shows how these techniques can deliver societal benefit from aggregate statistics in new domains while minimizing risk to individuals and to data custodians.

Privacy Protection and Federated Learning

Privacy is inherently a multi-faceted concept with three key components: transparency and user consent; data minimization; and data anonymization.

Transparency and user consent are the foundation of privacy protection: they are how users understand and approve the use of their data. Privacy-preserving technologies cannot replace transparency and consent, but they make it easier to reason about which kinds of data can be used or excluded by design, which in turn makes privacy statements easier to understand, verify, and enforce.

Here, the data is used mainly to train federated learning models and to compute metrics or other aggregate statistics over user data. Applied to aggregation, data minimization means collecting only the data needed for a particular computation, restricting access to that data at every stage, processing individuals' data as early as possible, and retaining as little of it as possible. In other words, data minimization limits access to all data to the smallest possible set of parties, and it is usually implemented through security mechanisms such as encryption, access control, secure multi-party computation, and trusted execution environments.
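To make the aggregation side of data minimization concrete, the toy sketch below shows the core idea behind cryptographic secure aggregation: clients add pairwise random masks that cancel in the sum, so a server that only sums the masked updates learns the aggregate but not any individual contribution. It is written in Python with NumPy; the function names are invented for illustration, and a real protocol would derive the masks from pairwise key agreement and handle client dropouts.

    import numpy as np

    def pairwise_masks(num_clients, dim, seed=0):
        # One shared random vector per client pair; real protocols derive these
        # from pairwise key agreement rather than a shared seed.
        rng = np.random.default_rng(seed)
        return {(i, j): rng.normal(size=dim)
                for i in range(num_clients) for j in range(i + 1, num_clients)}

    def mask_update(i, update, masks, num_clients):
        # Client i adds +r for partners with a larger index and -r for partners
        # with a smaller index, so all masks cancel in the server-side sum.
        masked = update.copy()
        for j in range(num_clients):
            if i == j:
                continue
            r = masks[(min(i, j), max(i, j))]
            masked += r if i < j else -r
        return masked

    # Toy demo: the server sees only masked vectors, yet their sum equals
    # the sum of the raw client updates (up to floating-point error).
    updates = [np.array([1.0, 2.0]), np.array([0.5, -1.0]), np.array([2.0, 0.0])]
    masks = pairwise_masks(len(updates), dim=2)
    masked = [mask_update(i, u, masks, len(updates)) for i, u in enumerate(updates)]
    print(np.sum(masked, axis=0), np.sum(updates, axis=0))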

Data anonymization means that the final output of a computation reveals nothing unique to any individual. When the goal is anonymous aggregation, the data any single user contributes to the computation should have only a small effect on the final aggregate output. For example, when aggregate statistics, including model parameters, are released publicly, they should not differ significantly depending on whether any particular user's data was included in the aggregate. In short, data minimization concerns how the computation is executed and how data is handled, while data anonymization concerns what is computed and what is released.

Federated learning is structurally data-minimizing. Importantly, data collection and aggregation are inseparable in the federated approach: client data is transformed into focused updates that are collected for immediate aggregation, and analysts never have access to any individual client's messages. Federated learning and federated analytics are both examples of a general federated computation pattern that embodies these data-minimization practices.

The traditional alternative is centralized processing, which replaces on-device preprocessing and immediate aggregation with raw data collection; whatever data minimization happens then occurs on the server as the logged data is processed.

The goals of federated learning and federated analytics are also consistent with anonymous aggregation. In machine learning, the goal is to train a model that predicts accurately for all users without overfitting to any one of them; likewise, for statistical queries, the goal is to estimate statistics that should not be unduly influenced by any one user's data. When federated learning is combined with privacy-preserving technologies such as differential privacy, the published aggregates can be guaranteed to be sufficiently anonymous.
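The usual way differential privacy is layered onto federated aggregation is to bound each user's influence by clipping their update and then add calibrated noise to the sum. The sketch below is a minimal illustration of that clip-and-noise idea in Python with NumPy; the function name and default parameters are invented for illustration, and real deployments pair this with formal privacy accounting rather than these toy values.

    import numpy as np

    def dp_sum(updates, clip_norm=1.0, noise_multiplier=1.0, seed=0):
        # Clip each client's update so no single user can dominate the sum,
        # then add Gaussian noise scaled to that clipping bound.
        rng = np.random.default_rng(seed)
        clipped = []
        for u in updates:
            norm = np.linalg.norm(u)
            clipped.append(u * min(1.0, clip_norm / (norm + 1e-12)))
        total = np.sum(clipped, axis=0)
        noise = rng.normal(0.0, noise_multiplier * clip_norm, size=total.shape)
        return total + noise  # only this noisy aggregate is released

    # Example: the second client's outlying update is clipped before aggregation.
    updates = [np.array([0.3, -0.2]), np.array([5.0, 5.0]), np.array([-0.1, 0.4])]
    print(dp_sum(updates))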

In many cases, data anonymization may not be applicable, and direct access by the service provider to an individual's sensitive data is unavoidable; in those interactions, the service provider should use the data only for its intended purpose.

Essentials of Federated Learning

Federated learning is characterized by keeping raw data decentralized and learning through aggregation. Locally generated data is heterogeneous in both distribution and quantity, which distinguishes federated learning from traditional distributed learning in the data center, where data can be distributed and shuffled arbitrarily and any compute node can access any of the data. In practice, the coordinating central server plays a significant and often necessary role; mobile devices, for example, lack fixed IP addresses and need a central server to mediate their communication.

2.1 Typical scenarios and applications

Two federated settings have received particular attention: cross-device federated learning, where the clients are very large numbers of mobile or IoT devices, and cross-silo federated learning, where the clients are typically a smaller number of organizations, institutions, or other data silos.

Table 1, adapted from Kairouz et al.,10 summarizes the key characteristics of the federated learning setting, highlights some key differences between the cross-device and cross-silo settings, and contrasts both with distributed learning in the data center.

Cross-device federated learning has been deployed on both Android and iOS phones for many applications, such as keyboard prediction. Cross-silo federated learning is used for problems such as health research, and finance is another emerging application area, with investment from WeBank, Credit Suisse, Intel, and others.

2.2 Federated Learning Algorithm

Machine learning, and deep learning in particular, is generally data-hungry and computationally intensive, so the feasibility of federatedly training high-quality models was far from a foregone conclusion.

Federated learning algorithms build on classic stochastic gradient descent (SGD), which is widely used to train machine learning models in traditional settings. A model is a function from training examples to predictions, parameterized by a vector of model weights, together with a loss function that measures the error between the prediction and the true output. SGD samples a batch of training examples (typically tens to thousands), computes the average gradient of the loss with respect to the model weights, and then adjusts the weights in the direction opposite to the gradient. With an appropriately tuned step size at each iteration, satisfactory convergence can be obtained even for non-convex functions.
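For concreteness, here is a minimal sketch of that update rule in Python with NumPy, applied to a toy least-squares problem (the function names and the data are invented for illustration, and the whole toy dataset is used as a single batch):

    import numpy as np

    def sgd_step(weights, grad_fn, batch_x, batch_y, lr=0.1):
        # Move the weights a small step against the average batch gradient.
        return weights - lr * grad_fn(weights, batch_x, batch_y)

    def lsq_grad(w, X, y):
        # Gradient of mean squared error for a linear model: X^T (Xw - y) / n.
        return X.T @ (X @ w - y) / len(y)

    X = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
    y = np.array([1.0, 2.0, 3.0])
    w = np.zeros(2)
    for _ in range(200):
        w = sgd_step(w, lsq_grad, X, y)
    print(w)  # converges toward [1.0, 2.0], which fits all three examples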

The most direct extension to federated learning broadcasts the current model weights to a random set of clients, has each of them compute the gradient of the loss on its local data, averages these gradients on the server, and then updates the global model weights. However, many iterations are usually needed to produce a highly accurate model, and rough calculations show that in a federated setting a single iteration can take several minutes. That would put training times anywhere from a month to a year, beyond the realm of practicality.

The key idea is intuitive: reduce communication and startup costs by performing multiple steps of SGD locally on each device and then averaging the resulting model updates, so that averaging happens far less often. If the models are averaged after every local step, progress is needlessly slow; if they are averaged too rarely, the local models can diverge and averaging may produce a worse model. Model training thus reduces to an application of federated aggregation: averaging the model gradients or updates.
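The sketch below, a toy Python/NumPy illustration in the spirit of federated averaging (all names and the tiny linear problem are invented, and it omits stragglers, secure aggregation, and every other production concern), shows one round of this local-steps-then-average pattern:

    import numpy as np

    def lsq_grad(w, X, y):
        # Gradient of mean squared error for a linear model.
        return X.T @ (X @ w - y) / len(y)

    def local_sgd(w, X, y, lr=0.1, steps=10):
        # A client runs several gradient steps on its own data before reporting.
        w = w.copy()
        for _ in range(steps):
            w -= lr * lsq_grad(w, X, y)
        return w

    def fedavg_round(global_w, clients, frac=0.7, rng=np.random.default_rng(0)):
        # One round: sample a fraction of clients, train locally on each,
        # then average the returned weights, weighted by local data size.
        k = max(1, int(frac * len(clients)))
        sampled = rng.choice(len(clients), size=k, replace=False)
        new_ws = [local_sgd(global_w, *clients[i]) for i in sampled]
        sizes = np.array([len(clients[i][1]) for i in sampled], dtype=float)
        return np.average(new_ws, axis=0, weights=sizes / sizes.sum())

    # Toy demo: three clients hold different slices of the same linear problem.
    clients = [(np.array([[1.0, 0.0]]), np.array([1.0])),
               (np.array([[0.0, 1.0]]), np.array([2.0])),
               (np.array([[1.0, 1.0]]), np.array([3.0]))]
    w = np.zeros(2)
    for _ in range(100):
        w = fedavg_round(w, clients)
    print(w)  # drifts toward [1.0, 2.0], which fits every client's data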

2.3 Typical Workflow

Having a working federated algorithm is a necessary starting point, but more is needed for cross-device federated learning to become a practical tool for product teams. For cross-device federated learning, a typical workflow looks like the following:

(1) Identify the problem

Usually, this means there is a need for a moderately sized (1-50MB) on-device model; the potential training data available on devices is richer or more representative than the data available in the data center; there are privacy or other reasons to prefer not to centralize the data; and the feedback signals needed to train the model are readily available on the device.

(2) Model development and evaluation

As with any machine learning task, choosing the right model architecture and hyperparameters (learning rate, batch size, regularization) is critical to success. In federated learning the challenge can be even greater, because it introduces a number of new hyperparameters, such as how many clients participate in each round and how many local steps each one performs. A common starting point is to do coarse model selection and tuning in simulation, using federated learning over proxy data available in the data center; final tuning and evaluation must then be performed with federated training on real devices.

Evaluation must also be carried out in a federated manner: independent of the training process, candidate global models are sent to devices so that accuracy metrics can be computed on those devices' local datasets and aggregated by the server, both as simple averages and as histograms of per-client performance.
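A rough sketch of what that per-candidate, server-side summary might look like is given below in Python with NumPy; the function, the dummy accuracy values, and the candidate names are all invented for illustration, and a real system would compute the metrics on devices and aggregate them privately rather than returning per-client values directly.

    import numpy as np

    def federated_eval_report(candidates, client_accuracies):
        # Summarize per-client accuracy for each candidate model as a mean
        # and a coarse histogram, rather than exposing raw client data.
        report = {}
        for name in candidates:
            acc = np.asarray(client_accuracies[name], dtype=float)
            report[name] = {
                "mean_accuracy": acc.mean(),
                "histogram": np.histogram(acc, bins=5, range=(0.0, 1.0))[0],
            }
        return report

    # Dummy numbers standing in for accuracies each device computed locally.
    client_accuracies = {"candidate_a": [0.82, 0.91, 0.74, 0.95],
                         "candidate_b": [0.60, 0.88, 0.79, 0.71]}
    print(federated_eval_report(["candidate_a", "candidate_b"], client_accuracies))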

These demands create two key infrastructure requirements:

(1) a high-performance federated learning simulation infrastructure that allows a smooth transition to running on real devices;

(2) a cross-device infrastructure that makes it easy to manage multiple simultaneous training and evaluation tasks.

(3) Deployment

Once a high-quality candidate model has been selected in step 2, its deployment typically follows the same procedure as for a model trained in the data center: additional validation and testing (possibly including manual quality assurance), live A/B testing against the previous production model, and a staged rollout to the full device fleet (potentially orders of magnitude more devices than actually participated in training).

It is worth noting that none of the work in step 2 affects the user experience on the devices participating in training and evaluation: models trained with federated learning do not surface predictions to users until the deployment step is complete. Ensuring that this processing does not negatively affect the device is a key infrastructure challenge; for example, intensive computation may be performed only when the device and its network connection are idle.

These workflows represent a significant challenge for building scalable infrastructure and APIs.
