Deploying GLM-4.7 on Modal GPUs: open-weights vs API-only approaches