| title | Known Issue - Policy Server MySQL DB Failures when an ASG is Bound to More than 148 Spaces | |
|---|---|---|
| expires_at | never | |
| tags | | |

The policy-server pre-start fails when...
- upgrading to CF Networking Release version 3.68.0 or 3.69.0
- AND using MySQL for the policy server DB
- AND dynamic ASGs are enabled
- AND at least one ASG is bound to more than 148 spaces for a single lifecycle (staging or running)

A fix will be included in CF Networking Release 3.70.0. For deployments that use MySQL DBs with dynamic ASGs enabled, we suggest skipping CF Networking Release 3.68.0 and 3.69.0 and upgrading directly to CF Networking Release 3.70.0 or higher.

When the bug is triggered, the policy server DB migration logs errors like the following:
{
"timestamp": "2025-05-01T10:45:10.131469969Z",
"level": "error",
"source": "cfnetworking.policy-server-migrate-db",
"message": "cfnetworking.policy-server-migrate-db.failed migrating and populating tags, retrying",
"data": {
"error": "perform migrations: executing migration: executor.Exec: Error 3906 (HY000): Exceeded max total length of values per record for multi-valued index staging_spaces_idx by 84 bytes. handling 82"
}
}
{
"timestamp": "2025-05-01T10:45:10.131469969Z",
"level": "error",
"source": "cfnetworking.policy-server-migrate-db",
"message": "cfnetworking.policy-server-migrate-db.failed migrating and populating tags, retrying",
"data": {
"error": "perform migrations: executing migration: executor.Exec: Error 3906 (HY000): Exceeded max total length of values per record for multi-valued index running_spaces_idx by 84 bytes. handling 83"
}
}
# Requires the cf CLI and jq.
# Lists every ASG bound to 148 or more spaces for either lifecycle.
security_groups="$(cf curl "/v3/security_groups")"
pages="$(echo "${security_groups}" | jq .pagination.total_pages)"
for (( p=1; p<=${pages}; p++ ))
do
  # Quote the URL so the shell does not treat "?" as a glob character.
  security_groups="$(cf curl "/v3/security_groups?page=${p}")"
  # ASGs bound to too many staging spaces
  echo "${security_groups}" | jq '[.resources[] | select(.relationships.staging_spaces.data | length >= 148)] | map({guid, name, staging_spaces_count: (.relationships.staging_spaces.data | length)})'
  # ASGs bound to too many running spaces
  echo "${security_groups}" | jq '[.resources[] | select(.relationships.running_spaces.data | length >= 148)] | map({guid, name, running_spaces_count: (.relationships.running_spaces.data | length)})'
done
If the script prints anything other than empty arrays ([]), then you will run into this bug and you should follow the mitigations. Below is an example of what the output of the script above looks like for an affected deployment.
[
{
"guid": "14ad7fc8-27c2-4456-9641-3d9f8cffb1c1",
"name": "too_many_staging_spaces_example",
"staging_spaces_count": 160
}
]
[
{
"guid": "14ad7fc8-27c2-4456-9641-3d9f8cffb1c1",
"name": "too_many_running_spaces_example",
"running_spaces_count": 170
}
]
- Connect to the policy server db.
- Run the following queries.
# for mysql
select name from security_groups WHERE JSON_LENGTH(staging_spaces) > 148;
select name from security_groups WHERE JSON_LENGTH(running_spaces) > 148;
If either of those queries returns any rows, then you will run into this bug and you should follow the mitigations.
Migrations 82 and 83 both add functional indexes to the policy server database to make dynamic ASGs more performant. However, when the contents of “staging_spaces” or “running_spaces” are too large, the functional index cannot be created and the migration fails. This causes the pre-start script to fail.
The “staging_spaces” and “running_spaces” columns become too large when a single ASG is bound to more than 148 individual spaces for that lifecycle.
You can force-skip these migrations.
- Access the policy server DB.
- Manually add the following rows so that migrations 82 and 83 are recorded as already applied.
insert into gorp_migrations (id, applied_at) values (82, NOW());
insert into gorp_migrations (id, applied_at) values (83, NOW());
This is a safe procedure. The permanent fix has taken into account the fact that some DBs will be altered manually like this.
Global ASGs do not need to be bound to individual spaces. However, they can be bound unnecessarily to individual spaces, which will trigger this bug.
- Make and bind a new global ASG with all the same rules as the problematic ASG.
- Delete the problematic ASG.
- Do not bind the new ASG to spaces or orgs individually.
Instead of binding one ASG to more than 148 spaces, make two identical ASGs and bind each of them to 148 or fewer spaces.
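As a rough sketch of the arithmetic behind this mitigation, the shell function below counts the spaces an ASG is bound to and reports how many identical ASGs would be needed to keep each one at or below 148 spaces. The function name plan_asg_split and the MAX_SPACES variable are hypothetical and not part of CF Networking Release. The input is assumed to be one space GUID per line.

```shell
#!/usr/bin/env bash
# Hypothetical helper, not part of cf-networking-release.
# Reads space GUIDs (one per line) on stdin and reports how many identical
# ASGs are needed so that none is bound to more than MAX_SPACES spaces.
MAX_SPACES=148

plan_asg_split() {
  local total=0 line
  while IFS= read -r line; do
    # Count only non-empty lines as space GUIDs.
    if [ -n "${line}" ]; then
      total=$(( total + 1 ))
    fi
  done
  # Ceiling division: round up so every space lands in some ASG.
  local asgs_needed=$(( (total + MAX_SPACES - 1) / MAX_SPACES ))
  echo "spaces=${total} asgs_needed=${asgs_needed}"
}
```

For example, `seq 1 300 | plan_asg_split` prints `spaces=300 asgs_needed=3`.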
If you have already deployed, or attempted to deploy, cf-networking-release version 3.68.0 or higher, you can run the policy server migrations manually.
# commands run on diego_database bootstrap VM
# become root
sudo su -
# make sure you are on the bootstrap VM, if this file is empty then you are on the wrong VM
cat /var/vcap/jobs/policy-server/bin/pre-start
# run the pre-start script. It will log output and will migrate the db
/var/vcap/jobs/policy-server/bin/pre-start
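The check-then-run steps above could be wrapped in a small guard, sketched below. The function name run_policy_server_pre_start is hypothetical. It assumes the pre-start script is executable and that a missing or empty file means you are on the wrong VM, as described in the comments above.

```shell
#!/usr/bin/env bash
# Hypothetical wrapper around the manual steps above.
# Refuses to run when the pre-start script is missing or empty, which
# indicates that this is not the bootstrap VM.
run_policy_server_pre_start() {
  local script="${1:-/var/vcap/jobs/policy-server/bin/pre-start}"
  if [ ! -s "${script}" ]; then
    echo "pre-start script is missing or empty: ${script}" >&2
    return 1
  fi
  # Run the pre-start script; it logs output and migrates the DB.
  "${script}"
}
```

Called with no argument as root on the VM, it uses the default path shown above. Passing a different path as the first argument is intended only for trying the guard out.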